From stefan at sun.ac.za Sat May 1 15:17:56 2010 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Sat, 1 May 2010 21:17:56 +0200 Subject: [Numpy-discussion] ndimage.label - howto force SWIG to use int32 - even on 64bit Linux ? In-Reply-To: References: Message-ID: Hi Sebastian On 27 April 2010 10:27, Sebastian Haase wrote: > Hi, > I wanted to write some C code to accept labels as they come from ndimage.label. > For some reason ndimage.label produces its output as an int32 array - > even on my 64bit system. I've merged Thouis's patch to implement scikits.image.measurements in pure Python. Would you please try the SVN version and see if it solves your problem, and also whether it performs to your satisfaction? Regards Stéfan From millman at berkeley.edu Sat May 1 16:19:54 2010 From: millman at berkeley.edu (Jarrod Millman) Date: Sat, 1 May 2010 13:19:54 -0700 Subject: [Numpy-discussion] proposing a "beware of [as]matrix()" warning In-Reply-To: <6A5E3F22-85FF-48ED-ADF6-B02B260F294D@cs.toronto.edu> References: <4BD62D8D.8060401@cs.toronto.edu> <4BD85D63.5080808@student.matnat.uio.no> <4BD87EB3.6010606@american.edu> <6A5E3F22-85FF-48ED-ADF6-B02B260F294D@cs.toronto.edu> Message-ID: On Wed, Apr 28, 2010 at 2:46 PM, David Warde-Farley wrote: > Would it be acceptable to retain the matrix class but not have it imported in the default namespace, and have to import e.g. numpy.matlib to get at them? +1 Jarrod From gokhansever at gmail.com Sat May 1 16:36:34 2010 From: gokhansever at gmail.com (=?UTF-8?Q?G=C3=B6khan_Sever?=) Date: Sat, 1 May 2010 15:36:34 -0500 Subject: [Numpy-discussion] Question about numpy.arange() Message-ID: Hello, Is "b" an expected value? I am suspecting another floating point arithmetic issue. I[1]: a = np.arange(1.6, 1.8, 0.1, dtype='float32') I[2]: a O[2]: array([ 1.60000002, 1.70000005], dtype=float32) I[3]: b = np.arange(1.7, 1.8, 0.1, dtype='float32') I[4]: b O[4]: array([ 1.70000005, 1.79999995], dtype=float32) A bit conflicting with the np.arange docstring: "Values are generated within the half-open interval ``[start, stop)`` (in other words, the interval including `start` but excluding `stop`)." Thanks. -- Gökhan From peridot.faceted at gmail.com Sat May 1 20:57:11 2010 From: peridot.faceted at gmail.com (Anne Archibald) Date: Sat, 1 May 2010 20:57:11 -0400 Subject: [Numpy-discussion] Question about numpy.arange() In-Reply-To: References: Message-ID: On 1 May 2010 16:36, Gökhan Sever wrote: > Hello, > > Is "b" an expected value? I am suspecting another floating point arithmetic > issue. > > I[1]: a = np.arange(1.6, 1.8, 0.1, dtype='float32') > > I[2]: a > O[2]: array([ 1.60000002, 1.70000005], dtype=float32) > > I[3]: b = np.arange(1.7, 1.8, 0.1, dtype='float32') > > I[4]: b > O[4]: array([ 1.70000005, 1.79999995], dtype=float32) > > A bit conflicting with the np.arange docstring: > > "Values are generated within the half-open interval ``[start, stop)`` > (in other words, the interval including `start` but excluding `stop`)." This is a floating-point issue; since 1.79999995 does not actually equal 1.8, it is included. This arises because 0.1, 1.7, and 1.8 cannot be exactly represented in floating-point. A good rule to avoid being annoyed by this is: only use arange for integers. Use linspace if you want floating-point. Anne > Thanks.
> > -- > Gökhan > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > From kwgoodman at gmail.com Sat May 1 20:59:49 2010 From: kwgoodman at gmail.com (Keith Goodman) Date: Sat, 1 May 2010 17:59:49 -0700 Subject: [Numpy-discussion] Question about numpy.arange() In-Reply-To: References: Message-ID: On Sat, May 1, 2010 at 1:36 PM, Gökhan Sever wrote: > Hello, > > Is "b" an expected value? I am suspecting another floating point arithmetic > issue. > > I[1]: a = np.arange(1.6, 1.8, 0.1, dtype='float32') > > I[2]: a > O[2]: array([ 1.60000002, 1.70000005], dtype=float32) > > I[3]: b = np.arange(1.7, 1.8, 0.1, dtype='float32') > > I[4]: b > O[4]: array([ 1.70000005, 1.79999995], dtype=float32) > > A bit conflicting with the np.arange docstring: > > "Values are generated within the half-open interval ``[start, stop)`` > (in other words, the interval including `start` but excluding `stop`)." Try np.linspace(). It works better with floats: >> np.linspace(1.7, 1.8, 2) array([ 1.7, 1.8]) From vincent at vincentdavis.net Sat May 1 22:31:30 2010 From: vincent at vincentdavis.net (Vincent Davis) Date: Sat, 1 May 2010 20:31:30 -0600 Subject: [Numpy-discussion] bug, issue with genfromtxt Message-ID: I ran into this issue and it was discussed on the pystatsmodels mailing list. Here is the setup: Running on a Mac 10.6. Using Office 2008. Saving a spreadsheet from Excel with "save as" to a csv file. Trying to import it using genfromtxt fails, reporting an EOL error. I thought this was because the EOL was wrong. It seems the file has '\r' as the line ending (this may be wrong); anyway, I changed it to '\n' and it works fine. I am told (on the pystatsmodels mailing list) that this is actually because the file is in unicode and that genfromtxt does not read the EOL correctly. To me it is a bug because one might expect a user to want to save a file from Excel and read it using genfromtxt. And for users with little experience the problem is not obvious. I guess this is not a problem with py3? ORIGINAL ATTEMPT datatype = [('date','|S9'),('gpd','i8'),('temp','i8'),('precip','f16')] data = np.genfromtxt('waterdata.csv', delimiter=',', skip_header=1, dtype=datatype) Traceback (most recent call last): File "/Applications/WingIDE.app/Contents/MacOS/src/debug/tserver/_sandbox.py", line 1, in # Used internally for debug sandbox under external interpreter File "/Library/Frameworks/EPD64.framework/Versions/6.1/lib/python2.6/site-packages/numpy/lib/io.py", line 1048, in genfromtxt raise IOError('End-of-file reached before encountering data.') IOError: End-of-file reached before encountering data. THIS DOES NOT WORK >>> s = file('data_with_CR.csv','r') >>> data = np.genfromtxt(s, delimiter=",", skip_header=1, dtype=None) Traceback (most recent call last): File "", line 1, in File "/Library/Frameworks/EPD64.framework/Versions/6.1/lib/python2.6/site-packages/numpy/lib/io.py", line 1048, in genfromtxt raise IOError('End-of-file reached before encountering data.') IOError: End-of-file reached before encountering data.
>>> data = np.genfromtxt(s, delimiter=",", , dtype=None) File "", line 1 data = np.genfromtxt(s, delimiter=",", , dtype=None) THIS DOES WORK >>> s = file('data_with_CR.csv','U') >>> data = np.genfromtxt(s, delimiter=",", skip_header=1, dtype=None) >>> data array([('1/1/00', 8021472, 52, 0.02), ('1/2/00', 9496016, 46, 0.059999999999999998), ('1/3/00', 8478792, 29, 0.0), ..., ('12/29/02', 10790000, 61, 0.0), ('12/30/02', 9501000, 44, 0.0), ('12/31/02', 9288000, 53, 0.0)], dtype=[('f0', '|S8'), ('f1', ' | LinkedIn From warren.weckesser at enthought.com Sat May 1 22:43:56 2010 From: warren.weckesser at enthought.com (Warren Weckesser) Date: Sat, 01 May 2010 21:43:56 -0500 Subject: [Numpy-discussion] Question about numpy.arange() In-Reply-To: References: Message-ID: <4BDCE6EC.2070403@enthought.com> Gökhan Sever wrote: > Hello, > > Is "b" an expected value? I am suspecting another floating point > arithmetic issue. Exactly. You'll see the same type of problem with float64, too: In [17]: z = np.arange(1.7, 1.8, 0.1) In [18]: z Out[18]: array([ 1.7, 1.8]) In [19]: z[1] == 1.8 Out[19]: True In [20]: z[1] - 1 Out[20]: 0.80000000000000004 Fun stuff, eh? To avoid problems like this, I generally use linspace instead of arange. Warren > > I[1]: a = np.arange(1.6, 1.8, 0.1, dtype='float32') > > I[2]: a > O[2]: array([ 1.60000002, 1.70000005], dtype=float32) > > I[3]: b = np.arange(1.7, 1.8, 0.1, dtype='float32') > > I[4]: b > O[4]: array([ 1.70000005, 1.79999995], dtype=float32) > > A bit conflicting with the np.arange docstring: > > "Values are generated within the half-open interval ``[start, stop)`` > (in other words, the interval including `start` but excluding > `stop`)." > > Thanks. > > -- > Gökhan > ------------------------------------------------------------------------ > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From robert.kern at gmail.com Sat May 1 23:03:12 2010 From: robert.kern at gmail.com (Robert Kern) Date: Sat, 1 May 2010 22:03:12 -0500 Subject: [Numpy-discussion] Question about numpy.arange() In-Reply-To: References: Message-ID: On Sat, May 1, 2010 at 15:36, Gökhan Sever wrote: > Hello, > > Is "b" an expected value? I am suspecting another floating point arithmetic > issue. > > I[1]: a = np.arange(1.6, 1.8, 0.1, dtype='float32') > > I[2]: a > O[2]: array([ 1.60000002, 1.70000005], dtype=float32) > > I[3]: b = np.arange(1.7, 1.8, 0.1, dtype='float32') > > I[4]: b > O[4]: array([ 1.70000005, 1.79999995], dtype=float32) > > A bit conflicting with the np.arange docstring: > > "Values are generated within the half-open interval ``[start, stop)`` > (in other words, the interval including `start` but excluding `stop`)." Not at all. 1.79999995 < 1.8. However, also note this warning in the arange() docs: """ For floating point arguments, the length of the result is ``ceil((stop - start)/step)``. Because of floating point overflow, this rule may result in the last element of `out` being greater than `stop`. """ You probably want linspace() instead. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth."
-- Umberto Eco From gginiu at gmail.com Sun May 2 06:13:21 2010 From: gginiu at gmail.com (Andrzej Giniewicz) Date: Sun, 2 May 2010 12:13:21 +0200 Subject: [Numpy-discussion] numpy + MKL and AMD64 CPU, recognized as 32 bit Message-ID: Hi, from what I've been reading, and testing myself, 64bit CPUs from AMD - and actually it seems like almost all 64bit architectures other than IA64 - use em64t in MKL. But in NumPy (at least 1.4.1), this architecture is picked only for Xeon - though older Xeons were not 64 bit iirc. I'm talking about the numpy/distutils/system_info.py file, mkl_info class, __init__ function. The fragment: from cpuinfo import cpu l = 'mkl' # use shared library if cpu.is_Itanium(): plt = '64' #l = 'mkl_ipf' elif cpu.is_Xeon(): plt = 'em64t' #l = 'mkl_em64t' else: plt = '32' #l = 'mkl_ia32' With this, my AMD64 that works with em64t (that's the version I have installed and that passes all tests, that is MKL 10.2.5.035) is recognized as 32 bit. Wouldn't it be better to check for is_64bit instead of is_Xeon? Thanks in advance for hints, Andrzej. From gokhansever at gmail.com Sun May 2 13:51:45 2010 From: gokhansever at gmail.com (=?UTF-8?Q?G=C3=B6khan_Sever?=) Date: Sun, 2 May 2010 12:51:45 -0500 Subject: [Numpy-discussion] Question about numpy.arange() In-Reply-To: References: Message-ID: On Sat, May 1, 2010 at 3:36 PM, Gökhan Sever wrote: > Hello, > > Is "b" an expected value? I am suspecting another floating point arithmetic > issue. > > I[1]: a = np.arange(1.6, 1.8, 0.1, dtype='float32') > > I[2]: a > O[2]: array([ 1.60000002, 1.70000005], dtype=float32) > > I[3]: b = np.arange(1.7, 1.8, 0.1, dtype='float32') > > I[4]: b > O[4]: array([ 1.70000005, 1.79999995], dtype=float32) > > A bit conflicting with the np.arange docstring: > > "Values are generated within the half-open interval ``[start, stop)`` > (in other words, the interval including `start` but excluding `stop`)." > > Thanks. > > -- > Gökhan > Fair enough; these are good explanations of why to use np.linspace instead. What was confusing me above is that while a[1] and b[0] both show 1.70000005, only "b" steps up to 1.79999995, which "a" can't. The following is another surprise output: I[5]: c = np.arange(0.4, 0.5, 0.1, dtype='float32') I[6]: c O[6]: array([ 0.40000001], dtype=float32) Anyways, a Slashdotter might have seen me asking these questions. I will go read the What Every Programmer Should Know About Floating-Point Arithmetic article. -- Gökhan From charlesr.harris at gmail.com Mon May 3 01:22:31 2010 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sun, 2 May 2010 23:22:31 -0600 Subject: [Numpy-discussion] &*(*&$%# Macros Message-ID: Hi Travis, Could you remove this macro? It isn't py3k compliant because of the PyCObject and we have too many public macros already. Please make it part of the interface if it needs to be exposed, otherwise make it an inline function somewhere. #define PyDataType_GetDatetimeMetaData(descr) \ ((descr->metadata == NULL) ? NULL : \ ((PyArray_DatetimeMetaData *)(PyCObject_AsVoidPtr( \ PyDict_GetItemString(descr->metadata, NPY_METADATA_DTSTR))))) Chuck From seb.haase at gmail.com Mon May 3 03:55:44 2010 From: seb.haase at gmail.com (Sebastian Haase) Date: Mon, 3 May 2010 09:55:44 +0200 Subject: [Numpy-discussion] ndimage.label - howto force SWIG to use int32 - even on 64bit Linux ?
In-Reply-To: References: Message-ID: Hi Stéfan, I have actually not been using scikits.image so far. Maybe I should give it a try. Are you saying that scikits.image.measurements has its own implementation of ndimage.label !? My original post was actually referring to a C++ function I wrote to calculate 2nd-order moments of labeled objects. I have that restricted to 3D contiguous data. scikits.image might already have had this function implemented in a general way ;-) Regards, Sebastian 2010/5/1 Stéfan van der Walt : > Hi Sebastian > > On 27 April 2010 10:27, Sebastian Haase wrote: >> Hi, >> I wanted to write some C code to accept labels as they come from ndimage.label. >> For some reason ndimage.label produces its output as an int32 array - >> even on my 64bit system. > > I've merged Thouis's patch to implement scikits.image.measurements in > pure Python. Would you please try the SVN version and see if it > solves your problem, and also whether it performs to your > satisfaction? > > Regards > Stéfan > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From austin.bingham at gmail.com Mon May 3 06:23:26 2010 From: austin.bingham at gmail.com (Austin Bingham) Date: Mon, 3 May 2010 12:23:26 +0200 Subject: [Numpy-discussion] PY_ARRAY_UNIQUE_SYMBOL is too far reaching? Message-ID: Hi everyone, I've recently been developing a python module and C++ library in parallel, with core functionality in python and C++ largely just layered on top of the python (with boost.python.) In some cases, however, for performance reasons, the C++ API "reaches into" the python code via the C API, and this tends to happen very often with numpy-related code. As a result, I'm using the numpy C API a lot in my C++ library. To make a long story short, I'm finding that there are many places where I need to include numpy headers in my own headers (e.g. when a template class uses part of the numpy API.) If the symbol I want in my header is in ndarrayobject.h, it seems that I'm obligated to define PY_ARRAY_UNIQUE_SYMBOL because that file includes __multiarray_api.h. However, defining that macro in a header file seems like a bad idea because of potential conflicts with headers from other libraries. As a motivating example, I have a header which implements a type-mapping template for numpy types. It maps at compile time between the NPY_TYPES enum and actual C++ types. To get the NPY_TYPES definitions, I have to include arrayobject.h. Even though my header doesn't actually use the symbols that PY_ARRAY_UNIQUE_SYMBOL influences, it seems that I need to define it; even without it being defined, __multiarray_api.h leaves a PyArray_API definition in my header. So, am I missing something obvious? Or is there no way to access symbols like NPY_TYPES without pulling in the PyArray_API functions? If there isn't, would it be possible to restructure the headers so that the symbols affected by PY_ARRAY_UNIQUE_SYMBOL are separated from the others? Thanks.
Austin From aisaac at american.edu Mon May 3 09:13:01 2010 From: aisaac at american.edu (Alan G Isaac) Date: Mon, 03 May 2010 09:13:01 -0400 Subject: [Numpy-discussion] Question about numpy.arange() In-Reply-To: References: Message-ID: <4BDECBDD.8050905@american.edu> On 5/2/2010 1:51 PM, G?khan Sever wrote: > The following is another surprise output: > > I[5]: c = np.arange(0.4, 0.5, 0.1, dtype='float32') > [6]: c > O[6]: array([ 0.40000001], dtype=float32) >>> a = np.array([0.4,0.5,0.1], dtype='float32') >>> a[0] 0.40000001 >>> (a[1]-a[0])/a[2] 0.99999994 >>> np.ceil((a[1]-a[0])/a[2]) 1.0 The docstring states: For floating point arguments, the length of the result is ``ceil((stop - start)/step)``. hth, Alan Isaac From charlesr.harris at gmail.com Mon May 3 10:34:04 2010 From: charlesr.harris at gmail.com (Charles R Harris) Date: Mon, 3 May 2010 08:34:04 -0600 Subject: [Numpy-discussion] Broken numpy tests in python3k Message-ID: They seem to be string/unicode related. 1) This error remains after replacing the disappeared built in function "file" with "open" ====================================================================== ERROR: test_universal_newline (test_io.TestLoadTxt) ---------------------------------------------------------------------- Traceback (most recent call last): File "/usr/local/lib/python3.1/site-packages/numpy/lib/tests/test_io.py", line 412, in test_universal_newline data = np.loadtxt(name) File "/usr/local/lib/python3.1/site-packages/numpy/lib/npyio.py", line 632, in loadtxt first_vals = split_line(first_line) File "/usr/local/lib/python3.1/site-packages/numpy/lib/npyio.py", line 610, in split_line line = line.split(comments)[0].strip() TypeError: Can't convert 'bytes' object to str implicitly 2) Not sure about this one, it doesn't show up for Python 2k ====================================================================== ERROR: test_type_check.TestDateTimeData.test_basic ---------------------------------------------------------------------- Traceback (most recent call last): File "/usr/local/lib/python3.1/site-packages/nose/case.py", line 177, in runTest self.test(*self.arg) File "/usr/local/lib/python3.1/site-packages/numpy/lib/tests/test_type_check.py", line 382, in test_basic assert_equal(datetime_data(a.dtype), ('us', 1, 1, 1)) File "/usr/local/lib/python3.1/site-packages/numpy/lib/type_check.py", line 641, in datetime_data result = func(obj) TypeError: this function takes at least 2 arguments (1 given) Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From ndbecker2 at gmail.com Mon May 3 10:02:10 2010 From: ndbecker2 at gmail.com (Neal Becker) Date: Mon, 03 May 2010 10:02:10 -0400 Subject: [Numpy-discussion] incremental histogram Message-ID: I have coded in c++ a histogram object that can be used as: h += my_sample or h += my_vector This is very useful in simulations which are looping and developing results incrementally. It would me great to have such a feature in numpy. From oliphant at enthought.com Mon May 3 11:23:51 2010 From: oliphant at enthought.com (Travis Oliphant) Date: Mon, 3 May 2010 11:23:51 -0400 Subject: [Numpy-discussion] &*(*&$%# Macros In-Reply-To: References: Message-ID: <50924C14-A74F-4C4C-985E-FE8CDD2D20A0@enthought.com> Perhaps as part of the refactoring. But I am not sure how to quantify too many public macros. This macro may need to be exposed. -- (mobile phone of) Travis Oliphant Enthought, Inc. 
1-512-536-1057 http://www.enthought.com On May 3, 2010, at 1:22 AM, Charles R Harris wrote: > Hi Travis, > > Could you remove this macro? It isn't py3k compliant because of the > PyCobject and we have too many public macros already. Please make it > part of the interface if it needs to be exposed, otherwise make it > an inline function somewhere. > > #define PyDataType_GetDatetimeMetaData > (descr) \ > ((descr->metadata == NULL) ? > NULL : \ > ((PyArray_DatetimeMetaData *)(PyCObject_AsVoidPtr > ( \ > PyDict_GetItemString(descr->metadata, > NPY_METADATA_DTSTR))))) > > Chuck > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From charlesr.harris at gmail.com Mon May 3 13:14:05 2010 From: charlesr.harris at gmail.com (Charles R Harris) Date: Mon, 3 May 2010 11:14:05 -0600 Subject: [Numpy-discussion] &*(*&$%# Macros In-Reply-To: <50924C14-A74F-4C4C-985E-FE8CDD2D20A0@enthought.com> References: <50924C14-A74F-4C4C-985E-FE8CDD2D20A0@enthought.com> Message-ID: On Mon, May 3, 2010 at 9:23 AM, Travis Oliphant wrote: > Perhaps as part of the refactoring. But I am not sure how to quantify > too many public macros. This macro may need to be exposed. > > It doesn't check for errors and the original was all on a single line, way over the limit. It needs to be a function. If *you* don't know if it needs be exposed it probably shouldn't be exposed. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From cournape at gmail.com Tue May 4 01:05:25 2010 From: cournape at gmail.com (David Cournapeau) Date: Tue, 4 May 2010 14:05:25 +0900 Subject: [Numpy-discussion] PY_ARRAY_UNIQUE_SYMBOL is too far reaching? In-Reply-To: References: Message-ID: On Mon, May 3, 2010 at 7:23 PM, Austin Bingham wrote: > Hi everyone, > > I've recently been developing a python module and C++ library in > parallel, with core functionality in python and C++ largely just > layered on top of the python (with boost.python.) In some cases, > however, for performance reasons, the C++ API "reaches into" the > python code via the C API, and this tends to happen very often with > numpy-related code. > > As a result, I'm using the numpy C API a lot in my C++ library. To > make a long story short, I'm finding that there are many places where > I need to include numpy headers in my own headers (e.g. when a > template class uses part of the numpy API.) If the symbol I want in my > header is in ndarrayobject.h, it seems that I'm obligated to define > PY_ARRAY_UNIQUE_SYMBOL because that file includes __multiarray_api.h. > However, defining that macro in a header file seems like a bad idea > because of potential conflicts with headers from other libraries. You don't need to define PY_ARRAY_UNIQUE_SYMBOL to include any public numpy header - it seems that you are trying to avoid getting PyArray_API defined, but I don't understand why. PY_ARRAY_UNIQUE_SYMBOL should only be used when you want to split your extension into separately compilation units (object files). David From stefan at sun.ac.za Tue May 4 01:55:24 2010 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Tue, 4 May 2010 07:55:24 +0200 Subject: [Numpy-discussion] ndimage.label - howto force SWIG to use int32 - even on 64bit Linux ? In-Reply-To: References: Message-ID: Hi Sebastian On 3 May 2010 09:55, Sebastian Haase wrote: > I have actually not been using scikits.image so far. > Maybe I should give it a try. 
> Are you saying that scikits.image.measurements has its own > implementation of ndimage.label !? Sorry, I wasn't focusing: I merged his patch into scipy.ndimage. Regards St?fan From charlesr.harris at gmail.com Tue May 4 02:15:43 2010 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 4 May 2010 00:15:43 -0600 Subject: [Numpy-discussion] datetime failure on py3k, string type issue. Message-ID: The following fails after fixing datetime_data assert_equal(datetime_data(a.dtype), ('us', 1, 1, 1)) The problem is that 'us' is unicode and the function call yields bytes. The question is: should datetime units use unicode when compiled on python >= 3k? Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From austin.bingham at gmail.com Tue May 4 03:38:52 2010 From: austin.bingham at gmail.com (Austin Bingham) Date: Tue, 4 May 2010 09:38:52 +0200 Subject: [Numpy-discussion] PY_ARRAY_UNIQUE_SYMBOL is too far reaching? In-Reply-To: References: Message-ID: On Tue, May 4, 2010 at 7:05 AM, David Cournapeau wrote: > On Mon, May 3, 2010 at 7:23 PM, Austin Bingham wrote: >> Hi everyone, >> >> I've recently been developing a python module and C++ library in >> parallel, with core functionality in python and C++ largely just >> layered on top of the python (with boost.python.) In some cases, >> however, for performance reasons, the C++ API "reaches into" the >> python code via the C API, and this tends to happen very often with >> numpy-related code. >> >> As a result, I'm using the numpy C API a lot in my C++ library. To >> make a long story short, I'm finding that there are many places where >> I need to include numpy headers in my own headers (e.g. when a >> template class uses part of the numpy API.) If the symbol I want in my >> header is in ndarrayobject.h, it seems that I'm obligated to define >> PY_ARRAY_UNIQUE_SYMBOL because that file includes __multiarray_api.h. >> However, defining that macro in a header file seems like a bad idea >> because of potential conflicts with headers from other libraries. > > You don't need to define PY_ARRAY_UNIQUE_SYMBOL to include any public > numpy header - it seems that you are trying to avoid getting > PyArray_API defined, but I don't understand why. > > PY_ARRAY_UNIQUE_SYMBOL should only be used when you want to split your > extension into separately compilation units (object files). I admit I'm having trouble formulating questions to address my problems, so please bear with me. Say I've got a shared library of utilities for working with numpy arrays. It's intended to be used in multiple extension modules and in some places that are not modules at all (e.g. C++ programs that embed python and want to manipulate arrays directly.) One of the headers in this library (call it 'util.h') includes arrayobject.h because, for example, it needs NPY_TYPES in some template definitions. Should this 'util.h' define PY_ARRAY_UNIQUE_SYMBOL? Or NO_IMPORT? It seems like the correct answers are 'no' and 'yes', but that means that any user of this header needs to be very aware of header inclusion order. For example, if they want to include 'arrayobject.h' for their own reasons *and* they want NO_IMPORT undefined, then they need to be sure to include 'util.h' after 'arrayobject.h'. >From what I can see, the problem seems to be a conflation of two sets of symbols: those influenced by the PY_ARRAY_UNIQUE_SYMBOL and NO_IMPORT macros (broadly, the API functions), those that aren't (types, enums, and so forth.) 
As things stand, there's no way for 'util.h' to use NPY_TYPES (part of the latter set) without affecting users of the former set. It seems easy enough to make the latter set available in their own header to help avoid these kinds of problem. Austin From denis-bz-py at t-online.de Tue May 4 07:57:30 2010 From: denis-bz-py at t-online.de (denis) Date: Tue, 04 May 2010 13:57:30 +0200 Subject: [Numpy-discussion] incremental histogram In-Reply-To: References: Message-ID: On 03/05/2010 16:02, Neal Becker wrote: > I have coded in c++ a histogram object that can be used as: > > h += my_sample > > or > > h += my_vector > > This is very useful in simulations which are looping and developing results > incrementally. It would me great to have such a feature in numpy. Neal, I like the idea of a faster np.histogram / histogramdd; but it would have to be compatible with numpy and pylab or at least a clear, documented subset (doc first). Some Wibnis, wouldn't it be nice ifs, for WibniHistogram: - gui with realtime zoom / upsample / smooth: must exist, physicists ? - adaptive binning, e.g. percentiles then uniform - interpolate: fill holes, then *linear or spline += data is nice, but seems orthogonal to histogramming -- why not just subclass histogram ? cheers -- denis From ndbecker2 at gmail.com Tue May 4 08:09:53 2010 From: ndbecker2 at gmail.com (Neal Becker) Date: Tue, 04 May 2010 08:09:53 -0400 Subject: [Numpy-discussion] incremental histogram References: Message-ID: denis wrote: > On 03/05/2010 16:02, Neal Becker wrote: >> I have coded in c++ a histogram object that can be used as: >> >> h += my_sample >> >> or >> >> h += my_vector >> >> This is very useful in simulations which are looping and developing >> results >> incrementally. It would me great to have such a feature in numpy. > > Neal, > I like the idea of a faster np.histogram / histogramdd; > but it would have to be compatible with numpy and pylab > or at least a clear, documented subset (doc first). The point is not to be faster, it's to be incremental. > > Some Wibnis, wouldn't it be nice ifs, for WibniHistogram: > - gui with realtime zoom / upsample / smooth: must exist, physicists ? > - adaptive binning, e.g. percentiles then uniform > - interpolate: fill holes, then *linear or spline > > += data is nice, but seems orthogonal to histogramming -- > why not just subclass histogram ? > I thought np histogram was a function, not a class? To be incremental, it has to have state, and so should be a class. From denis-bz-py at t-online.de Tue May 4 11:02:28 2010 From: denis-bz-py at t-online.de (denis) Date: Tue, 04 May 2010 17:02:28 +0200 Subject: [Numpy-discussion] incremental histogram In-Reply-To: References: Message-ID: On 04/05/2010 14:09, Neal Becker wrote: > denis wrote: >> Neal, >> I like the idea of a faster np.histogram / histogramdd; >> but it would have to be compatible with numpy and pylab >> or at least a clear, documented subset (doc first). > > The point is not to be faster, it's to be incremental. OK, different points: I'd like it to be very fast and leverage it >> Some Wibnis, wouldn't it be nice ifs, for WibniHistogram: >> - gui with realtime zoom / upsample / smooth: must exist, physicists ? >> - adaptive binning, e.g. percentiles then uniform >> - interpolate: fill holes, then *linear or spline Do any of these make sense / resonate ? >> += data is nice, but seems orthogonal to histogramming -- >> why not just subclass histogram ? >> > > I thought np histogram was a function, not a class? 
To be incremental, it > has to have state, and so should be a class. Yes you're right. Is it worth making into a class, with C or Cython ? From sccolbert at gmail.com Tue May 4 12:20:44 2010 From: sccolbert at gmail.com (S. Chris Colbert) Date: Tue, 04 May 2010 12:20:44 -0400 Subject: [Numpy-discussion] Poll: Semantics for % in Cython In-Reply-To: <49B95BA4.8010800@student.matnat.uio.no> References: <49B95BA4.8010800@student.matnat.uio.no> Message-ID: <1272990044.1977.2.camel@broo> On Thu, 2009-03-12 at 19:59 +0100, Dag Sverre Seljebotn wrote: > (First off, is it OK to continue polling the NumPy list now and then on > Cython language decisions? Or should I expect that any interested Cython > users follow the Cython list?) > > In Python, if I write "-1 % 5", I get 4. However, in C if I write "-1 % > 5" I get -1. The question is, what should I get in Cython if I write (a > % b) where a and b are cdef ints? Should I > > [ ] Get 4, because it should behave just like in Python, avoiding > surprises when adding types to existing algorithms (this will require > extra logic and be a bit slower) > > [ ] Get -1, because they're C ints, and besides one isn't using > Cython if one doesn't care about performance > > Whatever we do, this also affects the division operator, so that one in > any case will have a==(a//b)*b+a%b. > > (Orthogonal to this, we can introduce compiler directives to change the > meaning of the operator from the default in a code blocks, and/or make > special functions for the semantics that are not chosen as default.) > I definitely fall into the "I prefer C semantics" crowd. Because my brain is in "C" mode whenever I write Cython. However, I totally understand the arguments from the other side, and I would not be upset if Cython went in that direction. You could say that I have my preference, but I can't make a strong argument for it. Chris From sccolbert at gmail.com Tue May 4 12:26:17 2010 From: sccolbert at gmail.com (Chris Colbert) Date: Tue, 4 May 2010 12:26:17 -0400 Subject: [Numpy-discussion] Poll: Semantics for % in Cython In-Reply-To: <1272990044.1977.2.camel@broo> References: <49B95BA4.8010800@student.matnat.uio.no> <1272990044.1977.2.camel@broo> Message-ID: On Tue, May 4, 2010 at 12:20 PM, S. Chris Colbert wrote: > On Thu, 2009-03-12 at 19:59 +0100, Dag Sverre Seljebotn wrote: > > (First off, is it OK to continue polling the NumPy list now and then on > > Cython language decisions? Or should I expect that any interested Cython > > users follow the Cython list?) > > > > In Python, if I write "-1 % 5", I get 4. However, in C if I write "-1 % > > 5" I get -1. The question is, what should I get in Cython if I write (a > > % b) where a and b are cdef ints? Should I > > > > [ ] Get 4, because it should behave just like in Python, avoiding > > surprises when adding types to existing algorithms (this will require > > extra logic and be a bit slower) > > > > [ ] Get -1, because they're C ints, and besides one isn't using > > Cython if one doesn't care about performance > > > > Whatever we do, this also affects the division operator, so that one in > > any case will have a==(a//b)*b+a%b. > > > > (Orthogonal to this, we can introduce compiler directives to change the > > meaning of the operator from the default in a code blocks, and/or make > > special functions for the semantics that are not chosen as default.) > > > > I definitely fall into the "I prefer C semantics" crowd. Because my > brain is in "C" mode whenever I write Cython. 
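To make the difference concrete, here is a tiny plain-Python sketch of the two conventions (the C side is only described in comments; this is just an illustration, not Cython code):

import math

# Python's % and // floor toward negative infinity, so the remainder
# takes the sign of the divisor: -1 % 5 == 4 and -1 // 5 == -1.
print(-1 % 5)            # 4
print(-1 // 5)           # -1

# C's integer % truncates toward zero instead, so (-1) % 5 == -1 there.
# math.fmod follows the C sign convention:
print(math.fmod(-1, 5))  # -1.0

# Both conventions keep the invariant a == (a // b) * b + a % b;
# they just pick different (quotient, remainder) pairs.
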
However, I totally > understand the arguments from the other side, and I would not be upset > if Cython went in that direction. > > You could say that I have my preference, but I can't make a strong > argument for it. > > Chris > > It seems I was a little late to the party. The mail client's sort-by-date was reversed. My apologies. -------------- next part -------------- An HTML attachment was scrubbed... URL: From d.l.goldsmith at gmail.com Tue May 4 14:51:37 2010 From: d.l.goldsmith at gmail.com (David Goldsmith) Date: Tue, 4 May 2010 11:51:37 -0700 Subject: [Numpy-discussion] Adding an ndarray.dot method In-Reply-To: References: <4BD62D8D.8060401@cs.toronto.edu> <55C0F57B-A246-4D16-B6EF-5C6F58CEEA37@enthought.com> <4BD87A77.9030108@american.edu> Message-ID: On Thu, Apr 29, 2010 at 12:30 PM, Pauli Virtanen wrote: > Wed, 28 Apr 2010 14:12:07 -0400, Alan G Isaac wrote: > [clip] > > Here is a related ticket that proposes a more explicit alternative: > > adding a ``dot`` method to ndarray. > > http://projects.scipy.org/numpy/ticket/1456 > > I kind of like this idea. Simple, obvious, and leads > to clear code: > > a.dot(b).dot(c) > > or in another multiplication order, > > a.dot(b.dot(c)) > > And here's an implementation: > > > http://github.com/pv/numpy-work/commit/414429ce0bb0c4b7e780c4078c5ff71c113050b6 > > I think I'm going to apply this, unless someone complains, I have a big one: NO DOCSTRING!!! We're just perpetuating the errors of the past people! Very discouraging! DG > as I > don't see any downsides (except maybe adding one more to the > huge list of methods ndarray already has). > > Cheers, > Pauli > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -- Mathematician: noun, someone who disavows certainty when their uncertainty set is non-empty, even if that set has measure zero. -------------- next part -------------- An HTML attachment was scrubbed... URL: From gberbeglia at gmail.com Tue May 4 16:06:52 2010 From: gberbeglia at gmail.com (gerardob) Date: Tue, 4 May 2010 13:06:52 -0700 (PDT) Subject: [Numpy-discussion] Improvement of performance Message-ID: <28452458.post@talk.nabble.com> Hello, I have written a very simple code that computes the gradient by finite differences of any general function. Keeping the same idea, I would like modify the code using numpy to make it faster. Any ideas? Thanks. def grad_finite_dif(self,x,user_data = None): assert len(x) == self.number_variables points=[] for j in range(self.number_variables): points.append(x.copy()) points[len(points)-1][j]=points[len(points)-1][j]+0.0000001 delta_f = [] counter=0 for j in range(self.number_variables): delta_f.append((self.eval(points[counter])-self.eval(x))/0.0000001) counter = counter + 1 return array(delta_f) -- View this message in context: http://old.nabble.com/Improvement-of-performance-tp28452458p28452458.html Sent from the Numpy-discussion mailing list archive at Nabble.com. From josef.pktd at gmail.com Tue May 4 16:17:22 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Tue, 4 May 2010 16:17:22 -0400 Subject: [Numpy-discussion] Improvement of performance In-Reply-To: <28452458.post@talk.nabble.com> References: <28452458.post@talk.nabble.com> Message-ID: On Tue, May 4, 2010 at 4:06 PM, gerardob wrote: > > Hello, I have written a very simple code that computes the gradient by finite > differences of any general function. 
Keeping the same idea, I would like > modify the code using numpy to make it faster. > Any ideas? > Thanks. > > ? ? ? def grad_finite_dif(self,x,user_data = None): > ? ? ? ? ? ? ? ?assert len(x) == self.number_variables > ? ? ? ? ? ? ? ?points=[] > ? ? ? ? ? ? ? ?for j in range(self.number_variables): > ? ? ? ? ? ? ? ? ? ? ? ?points.append(x.copy()) > ? ? ? ? ? ? ? ? ? ? ? ?points[len(points)-1][j]=points[len(points)-1][j]+0.0000001 > ? ? ? ? ? ? ? ?delta_f = [] > ? ? ? ? ? ? ? ?counter=0 > ? ? ? ? ? ? ? ?for j in range(self.number_variables): > ? ? ? ? ? ? ? ? ? ? ? ?delta_f.append((self.eval(points[counter])-self.eval(x))/0.0000001) it looks like your are evaluating the same point several times self.eval(x) > ? ? ? ? ? ? ? ? ? ? ? ?counter = counter + 1 > ? ? ? ? ? ? ? ?return array(delta_f) That's what I used as a pattern for a gradient function #from scipy.optimize def approx_fprime(xk,f,epsilon,*args): f0 = f(*((xk,)+args)) grad = np.zeros((len(xk),), float) ei = np.zeros((len(xk),), float) for k in range(len(xk)): ei[k] = epsilon grad[k] = (f(*((xk+ei,)+args)) - f0)/epsilon ei[k] = 0.0 return grad Josef > -- > View this message in context: http://old.nabble.com/Improvement-of-performance-tp28452458p28452458.html > Sent from the Numpy-discussion mailing list archive at Nabble.com. > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From lasagnadavide at gmail.com Tue May 4 17:36:19 2010 From: lasagnadavide at gmail.com (Davide Lasagna) Date: Tue, 4 May 2010 23:36:19 +0200 Subject: [Numpy-discussion] Improvement of performance In-Reply-To: References: <28452458.post@talk.nabble.com> Message-ID: If your x data are equispaced I would do something like this def derive( func, x): """ Approximate the first derivative of function func at points x. """ # compute the values of y = func(x) y = func(x) # compute the step dx = x[1] - x[0] # kernel array for second order accuracy centered first derivative kc = np.array([-1.0, +0.0, +1.0]) / 2 / dx # kernel array for second order accuracy left and right first derivative kl = np.array([-3.0, +4.0, -1.0]) / 2 / dx kr = np.array([+1.0, -4.0, +3.0]) / 2 / dx # correlate it with the original array, # note that only the valid computation are performed derivs_c = np.correlate( y, kc, mode='valid' ) derivs_r = np.correlate( y[-3:], kr, mode='valid' ) derivs_l = np.correlate( y[:+3], kl, mode='valid' ) return np.r_[derivs_l, derivs_c, derivs_r] This is actually quite fast: on my machine (1.7GHz) i have this. >>>:x = np.linspace(0,2*np.pi, 1e6) >>>:func = lambda x: np.sin(x) >>>:timeit derive(func, x) 10 loops, best of 3: 177 ms per loop I'm curious if someone comes up with something faster. Regards, Davide On 4 May 2010 22:17, wrote: > On Tue, May 4, 2010 at 4:06 PM, gerardob wrote: > > > > Hello, I have written a very simple code that computes the gradient by > finite > > differences of any general function. Keeping the same idea, I would like > > modify the code using numpy to make it faster. > > Any ideas? > > Thanks. 
> > > > def grad_finite_dif(self,x,user_data = None): > > assert len(x) == self.number_variables > > points=[] > > for j in range(self.number_variables): > > points.append(x.copy()) > > > points[len(points)-1][j]=points[len(points)-1][j]+0.0000001 > > delta_f = [] > > counter=0 > > for j in range(self.number_variables): > > > delta_f.append((self.eval(points[counter])-self.eval(x))/0.0000001) > > it looks like your are evaluating the same point several times self.eval(x) > > > counter = counter + 1 > > return array(delta_f) > > That's what I used as a pattern for a gradient function > > #from scipy.optimize > def approx_fprime(xk,f,epsilon,*args): > f0 = f(*((xk,)+args)) > grad = np.zeros((len(xk),), float) > ei = np.zeros((len(xk),), float) > for k in range(len(xk)): > ei[k] = epsilon > grad[k] = (f(*((xk+ei,)+args)) - f0)/epsilon > ei[k] = 0.0 > return grad > > Josef > > > -- > > View this message in context: > http://old.nabble.com/Improvement-of-performance-tp28452458p28452458.html > > Sent from the Numpy-discussion mailing list archive at Nabble.com. > > > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From sebastian.walter at gmail.com Tue May 4 17:57:15 2010 From: sebastian.walter at gmail.com (Sebastian Walter) Date: Tue, 4 May 2010 23:57:15 +0200 Subject: [Numpy-discussion] Improvement of performance In-Reply-To: References: <28452458.post@talk.nabble.com> Message-ID: playing devil's advocate I'd say use Algorithmic Differentiation instead of finite differences ;) that would probably speed things up quite a lot. On Tue, May 4, 2010 at 11:36 PM, Davide Lasagna wrote: > If your x data are equispaced I would do something like this > def derive( func, x): > """ > ?? ? ? ?Approximate the first derivative of ?function func at points x. > """ > # compute the values of y = func(x) > y = func(x) > # compute the step > dx = x[1] - x[0] > # kernel array for second order accuracy centered first derivative > kc = np.array([-1.0, +0.0, +1.0]) / 2 / dx > # kernel array for second order accuracy left and right first derivative > kl = np.array([-3.0, +4.0, -1.0]) / 2 / dx > kr = np.array([+1.0, -4.0, +3.0]) / 2 / dx > # correlate it with the original array, > # note that only the valid computation are performed > derivs_c = np.correlate( y, kc, mode='valid' ?) > derivs_r = np.correlate( y[-3:], kr, mode='valid' ?) > derivs_l = np.correlate( y[:+3], kl, mode='valid' ?) > return np.r_[derivs_l, derivs_c, derivs_r] > This is actually quite fast: on my machine (1.7GHz) i have this. >>>>:x = np.linspace(0,2*np.pi, 1e6) >>>>:func = lambda x: np.sin(x) >>>>:timeit derive(func, x) > 10 loops, best of 3: 177 ms per loop > I'm curious if someone comes up with something faster. > > Regards, > Davide > > On 4 May 2010 22:17, wrote: >> >> On Tue, May 4, 2010 at 4:06 PM, gerardob wrote: >> > >> > Hello, I have written a very simple code that computes the gradient by >> > finite >> > differences of any general function. Keeping the same idea, I would like >> > modify the code using numpy to make it faster. >> > Any ideas? >> > Thanks. >> > >> > ? ? ? def grad_finite_dif(self,x,user_data = None): >> > ? ? ? ? ? ? ? 
?assert len(x) == self.number_variables >> > ? ? ? ? ? ? ? ?points=[] >> > ? ? ? ? ? ? ? ?for j in range(self.number_variables): >> > ? ? ? ? ? ? ? ? ? ? ? ?points.append(x.copy()) >> > >> > ?points[len(points)-1][j]=points[len(points)-1][j]+0.0000001 >> > ? ? ? ? ? ? ? ?delta_f = [] >> > ? ? ? ? ? ? ? ?counter=0 >> > ? ? ? ? ? ? ? ?for j in range(self.number_variables): >> > >> > ?delta_f.append((self.eval(points[counter])-self.eval(x))/0.0000001) >> >> it looks like your are evaluating the same point several times >> self.eval(x) >> >> > ? ? ? ? ? ? ? ? ? ? ? ?counter = counter + 1 >> > ? ? ? ? ? ? ? ?return array(delta_f) >> >> That's what I used as a pattern for a gradient function >> >> #from scipy.optimize >> def approx_fprime(xk,f,epsilon,*args): >> ? ?f0 = f(*((xk,)+args)) >> ? ?grad = np.zeros((len(xk),), float) >> ? ?ei = np.zeros((len(xk),), float) >> ? ?for k in range(len(xk)): >> ? ? ? ?ei[k] = epsilon >> ? ? ? ?grad[k] = (f(*((xk+ei,)+args)) - f0)/epsilon >> ? ? ? ?ei[k] = 0.0 >> ? ?return grad >> >> Josef >> >> > -- >> > View this message in context: >> > http://old.nabble.com/Improvement-of-performance-tp28452458p28452458.html >> > Sent from the Numpy-discussion mailing list archive at Nabble.com. >> > >> > _______________________________________________ >> > NumPy-Discussion mailing list >> > NumPy-Discussion at scipy.org >> > http://mail.scipy.org/mailman/listinfo/numpy-discussion >> > >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > From guilherme at gpfreitas.com Tue May 4 20:23:31 2010 From: guilherme at gpfreitas.com (Guilherme P. de Freitas) Date: Tue, 4 May 2010 17:23:31 -0700 Subject: [Numpy-discussion] Improvement of performance In-Reply-To: References: <28452458.post@talk.nabble.com> Message-ID: On Tue, May 4, 2010 at 2:57 PM, Sebastian Walter wrote: > playing devil's advocate I'd say use Algorithmic Differentiation > instead of finite differences ;) > that would probably speed things up quite a lot. I would suggest that too, but aside from FuncDesigner[0] (reference in the end), I couldn't find any Automatic Differentiation tool that was easy to install for Python. To stay with simple solutions, I think that the "complex step" approximation gives you a very good compromise between ease of use, performance and accuracy. Here is an implementation (and you can take ideas from it to get rid of your "for" loops in your original code) import numpy as np def complex_step_grad(f, x, h=1.0e-20): dim = np.size(x) increments = np.identity(dim) * 1j * h partials = [f(x+ih).imag / h for ih in increments] return np.array(partials) **Warning**: you must convert your original real-valued function f: R^n -> R to the corresponding complex function f: C^n -> C. Use functions from the 'cmath' module. I strongly suggest that you take a look at the AutoDiff website[1] and at some references about the complex step [2][3] (or just Google "complex step" and "differentiation", both on normal Google and Google Scholar). [0] http://openopt.org/FuncDesigner [1] http://www.autodiff.org/ [2] http://doi.acm.org/10.1145/838250.838251 [3] http://mdolab.utias.utoronto.ca/resources/complex-step -- Guilherme P. 
de Freitas http://www.gpfreitas.com From guilherme at gpfreitas.com Tue May 4 20:47:45 2010 From: guilherme at gpfreitas.com (Guilherme P. de Freitas) Date: Tue, 4 May 2010 17:47:45 -0700 Subject: [Numpy-discussion] Improvement of performance In-Reply-To: References: <28452458.post@talk.nabble.com> Message-ID: I forgot to mention one thing: if you are doing optimization, a good solution is a modeling package like AMPL (or GAMS or AIMMS, but I only know AMPL, so I will restrict my attention to it). AMPL has a natural modeling language and provides you with automatic differentiation. It's not free, but there are trial licenses (60 days) and student a student edition (unlimited time, maximum of 300 variables). It is hooked to many great solvers, including KNITRO (commercial) and IPOPT (free), both for nonlinear programs. http://www.ampl.com And as this is a Python list, there is NLPy. I never tried it, but it seems to allow you to read model files written in AMPL, use AMPL's automatic differentiation capabilities, and still roll your own optimization algorithm, all in Python. It looks like it requires some other packages aside from Python, NumPy and AMPL, like a sparse linear solver and some other things. Worth taking a look. http://nlpy.sourceforge.net/how.html Best, Guilherme On Tue, May 4, 2010 at 5:23 PM, Guilherme P. de Freitas wrote: > On Tue, May 4, 2010 at 2:57 PM, Sebastian Walter > wrote: >> playing devil's advocate I'd say use Algorithmic Differentiation >> instead of finite differences ;) >> that would probably speed things up quite a lot. > > I would suggest that too, but aside from FuncDesigner[0] (reference in > the end), I couldn't find any Automatic Differentiation tool that was > easy to install for Python. > > To stay with simple solutions, I think that the "complex step" > approximation gives you a very good compromise between ease of use, > performance and accuracy. ?Here is an implementation (and you can take > ideas from it to get rid of your "for" loops in your original code) > > import numpy as np > > def complex_step_grad(f, x, h=1.0e-20): > ? ?dim = np.size(x) > ? ?increments = np.identity(dim) * 1j * h > ? ?partials = [f(x+ih).imag / h for ih in increments] > ? ?return np.array(partials) > > **Warning**: you must convert your original real-valued function f: > R^n -> R to the corresponding complex function f: C^n -> C. Use > functions from the 'cmath' module. > > I strongly suggest that you take a look at the AutoDiff website[1] and > at some references about the complex step [2][3] (or just Google > "complex step" and "differentiation", both on normal Google and Google > Scholar). > > [0] http://openopt.org/FuncDesigner > [1] http://www.autodiff.org/ > [2] http://doi.acm.org/10.1145/838250.838251 > [3] http://mdolab.utias.utoronto.ca/resources/complex-step > > > -- > Guilherme P. de Freitas > http://www.gpfreitas.com > -- Guilherme P. de Freitas http://www.gpfreitas.com From gokhansever at gmail.com Tue May 4 21:38:04 2010 From: gokhansever at gmail.com (=?UTF-8?Q?G=C3=B6khan_Sever?=) Date: Tue, 4 May 2010 20:38:04 -0500 Subject: [Numpy-discussion] Question about numpy.ma masking Message-ID: Hello, I have the following arrays read as masked array. 
I[10]: basic.data['Air_Temp'].mask O[10]: array([ True, False, False, ..., False, False, False], dtype=bool) [12]: basic.data['Press_Alt'].mask O[12]: False I[13]: len basic.data['Air_Temp'] -----> len(basic.data['Air_Temp']) O[13]: 1758 The first item data['Air_Temp'] has only the first element masked and this result with mask attribute being created an equal data length bool array. On the other hand data['Press_Alt'] has no elements to mask yielding a 'False' scalar. Is this a documented behavior or intentionally designed this way? This is the only case out of 20 that breaks my code as following: :) IndexError Traceback (most recent call last) 130 for k in range(len(shorter)): 131 if (serialh.data['dccnTempSF'][k] != 0) \ --> 132 and (basic.data['Air_Temp'].mask[k+diff] == False): 133 dccnConAmb[k] = serialc.data['dccnConc'][k] * \ 134 physical.data['STATIC_PR'][k+diff] * \ IndexError: invalid index to scalar variable. since mask is a scalar in this case, nothing to loop terminating with an IndexError. -- G?khan -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Wed May 5 00:23:42 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Wed, 5 May 2010 00:23:42 -0400 Subject: [Numpy-discussion] Improvement of performance In-Reply-To: References: <28452458.post@talk.nabble.com> Message-ID: On Tue, May 4, 2010 at 8:23 PM, Guilherme P. de Freitas wrote: > On Tue, May 4, 2010 at 2:57 PM, Sebastian Walter > wrote: >> playing devil's advocate I'd say use Algorithmic Differentiation >> instead of finite differences ;) >> that would probably speed things up quite a lot. > > I would suggest that too, but aside from FuncDesigner[0] (reference in > the end), I couldn't find any Automatic Differentiation tool that was > easy to install for Python. > > To stay with simple solutions, I think that the "complex step" > approximation gives you a very good compromise between ease of use, > performance and accuracy. ?Here is an implementation (and you can take > ideas from it to get rid of your "for" loops in your original code) > > import numpy as np > > def complex_step_grad(f, x, h=1.0e-20): > ? ?dim = np.size(x) > ? ?increments = np.identity(dim) * 1j * h > ? ?partials = [f(x+ih).imag / h for ih in increments] > ? ?return np.array(partials) > > **Warning**: you must convert your original real-valued function f: > R^n -> R to the corresponding complex function f: C^n -> C. Use > functions from the 'cmath' module. Interesting idea I tried it with some silly function that has special.gammaln and np.dot in it, and it seems to work without adjustments to the code. The precision is much better than simple forward differentiation, which I have only with 1e-5 accuracy. simple timing 40% slower than simple forward differentiation 50%-80% faster that forward and backward differentiation In [2] I didn't see anything about higher derivatives, so to get the Hessian I still had to do a finite difference (Jacobian) on the complex_step_grad. Even then the results look pretty good. Another recommendation especially to check whether the results are correct: http://pypi.python.org/pypi/Numdifftools/ is pure python, (optionally adaptive) finite difference method for Gradient, Jacobian and Hessian. And related: I was surprised that sympy knows what the analytical derivative of the gamma function is, and that the function is available in scipy.special, so one less reason not to use analytical derivatives. 
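Roughly the pattern I used, as a sketch (complex_step_grad here is Guilherme's function from earlier in the thread, f has to accept complex input, and the helper name and step sizes are just the ones I happened to pick):

import numpy as np

def complex_step_grad(f, x, h=1.0e-20):
    # first derivatives via the complex step: df/dx_i ~ Im(f(x + i*h*e_i)) / h
    increments = np.identity(np.size(x)) * 1j * h
    return np.array([f(x + ih).imag / h for ih in increments])

def approx_hess_cs(f, x, epsilon=1e-6):
    # Hessian as a forward-difference Jacobian of the complex-step gradient
    x = np.asarray(x, dtype=float)
    n = x.size
    g0 = complex_step_grad(f, x)
    hess = np.zeros((n, n))
    for k in range(n):
        xk = x.copy()
        xk[k] += epsilon
        hess[:, k] = (complex_step_grad(f, xk) - g0) / epsilon
    # symmetrize, since the two difference estimates need not agree exactly
    return (hess + hess.T) / 2.0
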
I will include complex_step_grad in the toolbox for Maximum Likelihood Estimation. Thanks for the information, Josef > > I strongly suggest that you take a look at the AutoDiff website[1] and > at some references about the complex step [2][3] (or just Google > "complex step" and "differentiation", both on normal Google and Google > Scholar). > > [0] http://openopt.org/FuncDesigner > [1] http://www.autodiff.org/ > [2] http://doi.acm.org/10.1145/838250.838251 > [3] http://mdolab.utias.utoronto.ca/resources/complex-step > > > -- > Guilherme P. de Freitas > http://www.gpfreitas.com > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From guilherme at gpfreitas.com Wed May 5 02:52:05 2010 From: guilherme at gpfreitas.com (Guilherme P. de Freitas) Date: Tue, 4 May 2010 23:52:05 -0700 Subject: [Numpy-discussion] Improvement of performance In-Reply-To: References: <28452458.post@talk.nabble.com> Message-ID: On Tue, May 4, 2010 at 9:23 PM, wrote: > In [2] I didn't see anything about higher derivatives, so to get the > Hessian I still had to do a finite difference (Jacobian) on the > complex_step_grad. Even then the results look pretty good. Yes, the traditional complex step does not solve the second derivatives problem. I think there are papers that try to address that (I don't know the literature though). But I have seen a paper [0] that extends the notion of complex number to multi-complex numbers, and then using the algebra of those multi-complex numbers the authors obtain higher order derivatives. It lacks the convenience of the complex step (in the sense that complex numbers are already implemented), but it's something to keep in mind. [0] http://soliton.ae.gatech.edu/people/rrussell/FinalPublications/ConferencePapers/2010Feb_SanDiego_AAS-10-218_mulicomplex.pdf But if we were to go that way (defining new numbers), maybe (I'm no expert in this area, some of the proponents of AD via dual numbers say it would be a better idea) it would be better to define dual numbers. They are like complex numbers, but their "imaginary component" (I don't think it's called that) d has the property that d^2 = 0. Using these numbers, one can obtain first and higher order derivatives "automatically" [1][2] (forward mode only, see refs.) [1] http://en.wikipedia.org/wiki/Automatic_differentiation#Automatic_differentiation_using_dual_numbers [2] http://conal.net/papers/beautiful-differentiation/ (recent paper with references, related blog posts and video of a talk) There is a group of people in the Haskell community [3][4] that are working on Automatic Differentiation via dual numbers (that gives you the "forward mode" only, see refs.). It looks really interesting. I saw somewhere that one could write external modules in Haskell for Python... [3] http://www.haskell.org/haskellwiki/Automatic_Differentiation [4] http://hackage.haskell.org/packages/archive/fad/1.0/doc/html/Numeric-FAD.html Just throwing ideas out there. I found it really neat, and potentially very useful. Now, as for the reverse mode of AD, there seems to be no shortcut, and it would be really nice, as it seems that the computational complexity of the reverse mode is lower than the forward mode. There are libraries, like TAPENADE [5] and ADIFOR [6] that do it, though. They do it by source code transformation. The TAPENADE tool even accepts C or Fortran code via the web and returns a differentiated code (worth playing with). 
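Coming back to the dual numbers for a moment, here is a toy pure-Python sketch, just to make the d**2 == 0 idea concrete (this is not one of the libraries above, and a real implementation would need many more operations than + and *):

class Dual(object):
    """Toy dual number a + b*d with d*d == 0; b carries the derivative."""
    def __init__(self, a, b=0.0):
        self.a, self.b = a, b
    def __add__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        return Dual(self.a + other.a, self.b + other.b)
    __radd__ = __add__
    def __mul__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        # (a1 + b1*d)*(a2 + b2*d) = a1*a2 + (a1*b2 + b1*a2)*d because d*d == 0
        return Dual(self.a * other.a, self.a * other.b + self.b * other.a)
    __rmul__ = __mul__

def forward_derivative(f, x):
    # seed the derivative part with 1.0 and read it off the result
    return f(Dual(x, 1.0)).b

# f(x) = x**3 + 2*x, so f'(2) = 3*4 + 2 = 14
print(forward_derivative(lambda x: x * x * x + 2 * x, 2.0))  # 14.0
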
Really neat. [5] http://www-sop.inria.fr/tropics/ [6] http://www.mcs.anl.gov/research/projects/adifor Ok, now all we need is easy access to these tools from Python, and we are set! :) -- Guilherme P. de Freitas http://www.gpfreitas.com From guilherme at gpfreitas.com Wed May 5 03:08:24 2010 From: guilherme at gpfreitas.com (Guilherme P. de Freitas) Date: Wed, 5 May 2010 00:08:24 -0700 Subject: [Numpy-discussion] Improvement of performance In-Reply-To: References: <28452458.post@talk.nabble.com> Message-ID: Just to make this thread more useful for someone interested in these topics, this seems to be "the book" on automatic differentiation (it's one of the references in the autodiff website) "Evaluating Derivatives: Principles and Techniques of Algorithmic Differentiation" (2nd ed.), by Andreas Griewank and Andrea Walther. SIAM, 2008. Link: http://www.ec-securehost.com/SIAM/OT105.html From pschmidtke at mmb.pcb.ub.es Wed May 5 04:02:38 2010 From: pschmidtke at mmb.pcb.ub.es (Peter Schmidtke) Date: Wed, 05 May 2010 10:02:38 +0200 Subject: [Numpy-discussion] 3d plane to point cloud fitting using SVD Message-ID: <7f9779a7b1b4b8c8be07c5663ca74c50@mmb.pcb.ub.es> Dear Numpy Users, I want to fit a 3d plane into a 3d point cloud and I saw that one could use svd for this purpose. So as I am very fond of numpy I saw that svd was implementented in the linalg module. Currently I have a numpy array called xyz with n lines (number of points) and 3 columns (x,y,z). I calculated the centroid as : xyz0=npy.mean(xyz, axis=0) #calculate the centroid Next I shift the centroid of the point cloud to the origin with. M=xyz-xyz0 next I saw by matlab analogy (http://www.mathworks.co.jp/matlabcentral/newsreader/view_thread/262996) that I can write this : u,s,vh=numpy.linalg.linalg.svd(M) Then in the matlab analog they use the last column of vh to get the a,b,c coefficients for the equation a,b,c=vh[:, -1] in numpy The problem is that the equation ax+by+cz=0 does not represent the plan through my point cloud at all. What am I doing wrong, how can I get the a,b and c coefficients? Thanks in advance. -- Peter Schmidtke ---------------------- PhD Student at the Molecular Modeling and Bioinformatics Group Dep. Physical Chemistry Faculty of Pharmacy University of Barcelona From stefan at sun.ac.za Wed May 5 06:15:34 2010 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Wed, 5 May 2010 12:15:34 +0200 Subject: [Numpy-discussion] BUG: NumPy exposes the wrong include directory Message-ID: Hi all, Under Ubuntu, the NumPy headers are dumped under /usr/include/python2.6. This is not the location that should be exposed when another copy of NumPy is installed. The attached patch lets from numpy.distutils.system_info import get_info print get_info['numpy'] return the correct path. However, it's a quick hack, and I'd like someone familiar with numpy.distutils to have a look and suggest a proper fix. Regards St?fan -------------- next part -------------- A non-text attachment was scrubbed... 
Name: system_info.patch
Type: text/x-diff
Size: 734 bytes
Desc: not available
URL:

From stefan at sun.ac.za  Wed May  5 06:26:51 2010
From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=)
Date: Wed, 5 May 2010 12:26:51 +0200
Subject: [Numpy-discussion] 3d plane to point cloud fitting using SVD
In-Reply-To: <7f9779a7b1b4b8c8be07c5663ca74c50@mmb.pcb.ub.es>
References: <7f9779a7b1b4b8c8be07c5663ca74c50@mmb.pcb.ub.es>
Message-ID:

Hi Peter

On 5 May 2010 10:02, Peter Schmidtke wrote:
> u,s,vh=numpy.linalg.linalg.svd(M)
>
> Then in the matlab analog they use the last column of vh to get the a,b,c
> coefficients for the equation
> a,b,c=vh[:, -1] in numpy

Note that vh is the conjugate transpose of v.  You are probably
interested in the rows of vh (the columns of V in MATLAB parlance).

Regards
Stéfan

From tinauser at libero.it  Wed May  5 06:30:07 2010
From: tinauser at libero.it (tino)
Date: Wed, 5 May 2010 10:30:07 +0000 (UTC)
Subject: [Numpy-discussion] =?utf-8?q?embedding/extending_numpy=3Aimport?=
	=?utf-8?q?=5Farray=28=29_problem?=
Message-ID:

Hi guys,

I have some C++ code that calls a Python script. The Python script, in turn,
should use C modules to retrieve data (arrays).
I'm having some problems with import_array. I've placed the import_array()
call just after the initialization of the interpreter and after Py_InitModule:

Py_Initialize();
Py_InitModule("py_c_if", PY_C_IF_Methods);
import_array();

These are inside a function called main_PYinit, called within the main C
function.
When I tried to compile, I got the error "function must return a value" at
the line with import_array. If I change __multiarray_api.h and make
import_array() a function returning void instead of a #define, the
compilation succeeds.
However, when I then try to use some API functions I get problems. In
particular I used this code:

char *camera_buffer = cam_data;
npy_int dim[] = {rows,cols}; /* whatever the size of your data, in C-order */
PyObject *myarray;
double *array_buffer;
myarray = PyArray_SimpleNew(2, &dim, NPY_INT);
Py_INCREF(myarray);
array_buffer = (double *)PyArray_DATA(myarray);
for (int i=0; i
References:
Message-ID: <4BE21BB6.4080805@silveregg.co.jp>

On 05/04/2010 04:38 PM, Austin Bingham wrote:
>
> I admit I'm having trouble formulating questions to address my
> problems, so please bear with me.
>
> Say I've got a shared library of utilities for working with numpy
> arrays. It's intended to be used in multiple extension modules and in
> some places that are not modules at all (e.g. C++ programs that embed
> python and want to manipulate arrays directly.)
>
> One of the headers in this library (call it 'util.h') includes
> arrayobject.h because, for example, it needs NPY_TYPES in some
> template definitions. Should this 'util.h' define
> PY_ARRAY_UNIQUE_SYMBOL? Or NO_IMPORT? It seems like the correct
> answers are 'no' and 'yes', but that means that any user of this
> header needs to be very aware of header inclusion order. For example,
> if they want to include 'arrayobject.h' for their own reasons *and*
> they want NO_IMPORT undefined, then they need to be sure to include
> 'util.h' after 'arrayobject.h'.

I still don't understand why you cannot just include the header file as
is (without defining any of NO_IMPORT/PY_ARRAY_UNIQUE_SYMBOL).

>From what I can see, the problem seems to be a conflation of two sets
> of symbols: those influenced by the PY_ARRAY_UNIQUE_SYMBOL and
> NO_IMPORT macros (broadly, the API functions), those that aren't
> (types, enums, and so forth.)
numpy headers are really messy - way too many macros, etc... Fixing it without breaking API compatibility is a lot of work, though, cheers, David From patrickmarshwx at gmail.com Wed May 5 23:21:39 2010 From: patrickmarshwx at gmail.com (Patrick Marsh) Date: Wed, 5 May 2010 22:21:39 -0500 Subject: [Numpy-discussion] Brooken Toolchain (Mac 10.6) In-Reply-To: References: Message-ID: I apologize for not following up on this thread right away. I was sick for a couple of days and then there were several major severe weather outbreaks that required me to spend more time at work. With this said, I've resumed trying to build a DMG with Python2.5. The "export" suggestion below worked with the broken toolchain, but now I get a more issues. I get the following error, "Cannot compiler 'Python.h'. Perhaps you need to install python-dev|python-devel," but the problem is these files do exist! Looking through the build log it appears that it fails on importing and various other standard libraries. I'm left to guess that this is an issue with gcc-4.0 picking up the wrong headers? I tried to re-install gcc-4.0, however the gcc-4.0.mpkg won't install because gcc-4.2 is found on the disk, so I don't know how to test this theory. I've uploaded the output from the build attempt (with the Traceback appended to the bottom). Short of reverting back to a previous version of OSX, I'm not sure what to try next. I'm open to any suggestion(s) at this point. Build log: http://www.patricktmarsh.com/tmp/build.out Patrick On Tue, Apr 20, 2010 at 4:11 AM, Robin wrote: > To build against the python.org 2.5 you need to use the older gcc: > > export CC=/usr/bin/gcc-4.0 > export CXX=/usr/bin/g++-4.0 > > should do it. By default snow leopard uses 4.2 now, which doesn't > support the -Wno-long-double option used when building python. > > Cheers > > Robin > > On Mon, Apr 19, 2010 at 3:55 PM, Patrick Marsh > wrote: > > Greetings, > > Per my previous email, I'm trying to setup the release process for Numpy > on > > my MacBook Pro. When trying to build Numpy 1.4.1r3 with Python 2.5.4 I > get > > a broken toolchain error (below). I do not get this error when trying to > > build Numpy with Python 2.6.5 - and there is nothing fundamentally > different > > (that I know of) between my Python 2.5.4 and Python 2.6.5 environments. > To > > address a previous suggestion I received offline, I do have sufficient > > permissions (I've even tried using sudo, just to make sure) and I do have > > setuptools installed. > > Any help, would be appreciated. > > > > Patrick > > > > ======================================================================== > > compile options: '-Inumpy/core/src/private -Inumpy/core/src -Inumpy/core > > -Inumpy/core/src/npymath -Inumpy/core/src/multiarray > -Inumpy/core/src/umath > > -Inumpy/core/include > > -I/Library/Frameworks/Python.framework/Versions/2.5/include/python2.5 -c' > > gcc: _configtest.c > > cc1: error: unrecognized command line option "-Wno-long-double" > > cc1: error: unrecognized command line option "-Wno-long-double" > > lipo: can't figure out the architecture type of: /var/tmp//ccjTUva4.out > > cc1: error: unrecognized command line option "-Wno-long-double" > > cc1: error: unrecognized command line option "-Wno-long-double" > > lipo: can't figure out the architecture type of: /var/tmp//ccjTUva4.out > > failure. 
> > removing: _configtest.c _configtest.o > > Traceback (most recent call last): > > File "setup.py", line 187, in > > setup_package() > > File "setup.py", line 180, in setup_package > > configuration=configuration ) > > File "/Users/pmarsh/git/numpy.release/numpy/distutils/core.py", line > 186, > > in setup > > return old_setup(**new_attr) > > File > > > "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/distutils/core.py", > > line 151, in setup > > dist.run_commands() > > File > > > "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/distutils/dist.py", > > line 974, in run_commands > > self.run_command(cmd) > > File > > > "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/distutils/dist.py", > > line 994, in run_command > > cmd_obj.run() > > File > "/Users/pmarsh/git/numpy.release/numpy/distutils/command/build.py", > > line 37, in run > > old_build.run(self) > > File > > > "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/distutils/command/build.py", > > line 112, in run > > self.run_command(cmd_name) > > File > > > "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/distutils/cmd.py", > > line 333, in run_command > > self.distribution.run_command(command) > > File > > > "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/distutils/dist.py", > > line 994, in run_command > > cmd_obj.run() > > File > > "/Users/pmarsh/git/numpy.release/numpy/distutils/command/build_src.py", > line > > 152, in run > > self.build_sources() > > File > > "/Users/pmarsh/git/numpy.release/numpy/distutils/command/build_src.py", > line > > 163, in build_sources > > self.build_library_sources(*libname_info) > > File > > "/Users/pmarsh/git/numpy.release/numpy/distutils/command/build_src.py", > line > > 298, in build_library_sources > > sources = self.generate_sources(sources, (lib_name, build_info)) > > File > > "/Users/pmarsh/git/numpy.release/numpy/distutils/command/build_src.py", > line > > 385, in generate_sources > > source = func(extension, build_dir) > > File "numpy/core/setup.py", line 657, in get_mathlib_info > > raise RuntimeError("Broken toolchain: cannot link a simple C > program") > > RuntimeError: Broken toolchain: cannot link a simple C program > > > > > > -- > > Patrick Marsh > > Ph.D. Student / NSSL Liaison to the HWT > > School of Meteorology / University of Oklahoma > > Cooperative Institute for Mesoscale Meteorological Studies > > National Severe Storms Laboratory > > http://www.patricktmarsh.com > > > > > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -- Patrick Marsh Ph.D. Student / NSSL Liaison to the HWT School of Meteorology / University of Oklahoma Cooperative Institute for Mesoscale Meteorological Studies National Severe Storms Laboratory http://www.patricktmarsh.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From austin.bingham at gmail.com Thu May 6 02:10:35 2010 From: austin.bingham at gmail.com (Austin Bingham) Date: Thu, 6 May 2010 08:10:35 +0200 Subject: [Numpy-discussion] PY_ARRAY_UNIQUE_SYMBOL is too far reaching? 
In-Reply-To: <4BE21BB6.4080805@silveregg.co.jp> References: <4BE21BB6.4080805@silveregg.co.jp> Message-ID: > I still don't understand why you cannot just include the header file as > is (without defining any of NO_IMPORT/PY_ARRAY_UNIQUE_SYMBOL). I guess the real point is that no matter what definition (or lack thereof) that I have for these macros, I still introduce header order dependencies to users of my library if I include arrayobject.h in one of my headers. Suppose I defined neither macro in my 'util.h', and that I included 'arrayobject.h'. If a user of my library did this: #include // <-- my library's header #define PY_ARRAY_UNIQUE_SYMBOL MY_UNIQUE_SYMBOL #define NO_IMPORT #include ... they'd likely crash. The inclusion of arrayobject.h in util.h activates the include guards in arrayobject.h, so the second inclusion has no real effect; their calls to numpy API methods would be made against garbage pointers. As a result, unless my library's user is keenly aware of what's going on, the API function pointers will not get set properly. In this case, of course, reordering the includes will probably fix the issue. But it's a classic example of an unhygienic header, and I think we can avoid this very easily (see below). > numpy headers are really messy - way too many macros, etc... Fixing it > without breaking API compatibility is a lot of work, though, That may be true in general, but it looks like there might be a simple solution in this case. In my copy of numpy (1.3.0), I've moved everything in ndarrayobject.h between the "CONFUSE_EMACS" stuff and the inclusion of "__multiarray_api.h" into a new header, nonfunc_api.h (though this is clearly a temporary name at best!). ndarrayobject.h now includes nonfunc_api.h in place of all of the removed code, and my util.h includes nonfunc_api.h instead of arrayobject.h. The result is that existing users of the numpy API are (I believe) completely unaffected. However, the new header makes it possible to include a lot of type definitions, enumerations, and all sorts of other stuff...in my thinking, everything from ndarrayobject.h that *doesn't* depend on the macros...without adding the burden of actually needing to consider the macros. FWIW, this arrangement seems to work for my projects. I haven't applied this patch, rebuilt numpy, and run the unittests, though I'd like to when I get a chance. Austin From martin.raspaud at smhi.se Thu May 6 02:50:33 2010 From: martin.raspaud at smhi.se (Martin Raspaud) Date: Thu, 06 May 2010 08:50:33 +0200 Subject: [Numpy-discussion] Decision tree-like algorithm on numpy arrays Message-ID: <4BE266B9.4050804@smhi.se> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Hi all, I have an old c-extension I want to remove from my code to the benefit of numpy, but it looks kind of tricky to me. Here is the thing: I have a number of arrays of the same shape. On these arrays, I run a sequence of tests, leading to a kind of decision tree. In the end, based on these tests, I get a number of result arrays where, based on the tests, each element gets a value. The way to do this in an efficient way with numpy is quite unclear to me. My first thought would be: result_array1 = np.where(some_test_on(array1), np.where(some_test_on(array2), 1, 2), np.where(some_test_on(array3, array4), np.where(some_test_on(array5), 3, 4), 4)) result_array2 = np.where(some_test_on(array1), np.where(some_test_on(array2), True, True), np.where(some_test_on(array3, array4), np.where(some_test_on(array5), True, False), True)) etc... 
but that means running the same tests several times, which is not acceptable if the tests are lengthy. In order to avoid this problem I could also have some mask based on each test: mask1 = some_test_on(array1) mask2 = some_test_on(array2[mask1]) mask3 = some_test_on(array3[!mask1], array4[!mask1]) mask4 = some_test_on(array5[!mask1][mask3]) result_array1[mask1][mask2] = 1 result_array1[mask1][!mask2] = 2 result_array1[!mask1][mask3][mask4] = 3 result_array1[!mask1][mask3][!mask4] = 4 result_array1[!mask1][!mask3] = 4 result_array2[mask1][mask2] = True result_array2[mask1][!mask2] = True result_array2[!mask1][mask3][mask4] = True result_array2[!mask1][mask3][!mask4] = False result_array2[!mask1][!mask3] = True etc... but that looks a bit clumsy to me... The way it was done in the C-extension was to run the decision tree on each element sequentially, but I have the feeling that would not be very efficient with numpy (although I know I can't beat pure C code, I would like to have comparable times). Does any of you wise people have an opinion on this ? Thanks, Martin -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.9 (GNU/Linux) Comment: Using GnuPG with Fedora - http://enigmail.mozdev.org/ iQEcBAEBAgAGBQJL4ma5AAoJEBdvyODiyJI4SpgH/i0bb7PH8oTu481NRuYmbi40 VwJrOCdfSo6CauLBiIdxBZV2Hksbu2iDu5GEKJNUObf9bM7N+LK+qMwaBq1M5hF+ 47yNczSEUaxshBHzUFQMlS9XEtZewhYZGepkH1oThIQbSD2IbM6fWkVj+EJRwwJ5 2Ia4p1GIdLGMZ3loaWevvCmz8kjppX7Feei0hEP28+HIiWq/qmUlccYZm/ThZcFE 6ROEKtkepKsf3vOfpuS5Hr6U1Hb4mo7u9SmUcOvlCby6q/TbVtwAZjpRQB4qKEjm DRj9EvyWBnINgr3tKVN2Cida1El8Ki9jBjhx2GxLsy78pNKqZMI9UC/iM8cehYQ= =hUa2 -----END PGP SIGNATURE----- -------------- next part -------------- A non-text attachment was scrubbed... Name: martin_raspaud.vcf Type: text/x-vcard Size: 260 bytes Desc: not available URL: From meine at informatik.uni-hamburg.de Thu May 6 04:21:24 2010 From: meine at informatik.uni-hamburg.de (Hans Meine) Date: Thu, 6 May 2010 10:21:24 +0200 Subject: [Numpy-discussion] PY_ARRAY_UNIQUE_SYMBOL is too far reaching? In-Reply-To: References: <4BE21BB6.4080805@silveregg.co.jp> Message-ID: <201005061021.25162.meine@informatik.uni-hamburg.de> Am Donnerstag 06 Mai 2010 08:10:35 schrieb Austin Bingham: > Suppose I defined neither macro in my 'util.h', and that I included > 'arrayobject.h'. If a user of my library did this: > > #include // <-- my library's header > > #define PY_ARRAY_UNIQUE_SYMBOL MY_UNIQUE_SYMBOL > #define NO_IMPORT > #include > > ... > > they'd likely crash. Really? Wouldn't it be really easy to check for this situation, i.e. augment the inclusion guards by some "if included before, but PY_ARRAY_UNIQUE_SYMBOL/NO_IMPORT settings are different than the last time, fail and tell the user about it"? At least that would give a compile error at an earlier point in time. HTH, Hans From austin.bingham at gmail.com Thu May 6 05:08:59 2010 From: austin.bingham at gmail.com (Austin Bingham) Date: Thu, 6 May 2010 11:08:59 +0200 Subject: [Numpy-discussion] PY_ARRAY_UNIQUE_SYMBOL is too far reaching? In-Reply-To: <201005061021.25162.meine@informatik.uni-hamburg.de> References: <4BE21BB6.4080805@silveregg.co.jp> <201005061021.25162.meine@informatik.uni-hamburg.de> Message-ID: >> they'd likely crash. > > Really? I base that on the assumption that they'd not know to call import_array() in that translation unit. 
This seems like a reasonable assumption because, by defining the macros as such, they are strongly implying that they expect the API functions to be imported for their definition of PY_ARRAY_UNIQUE_SYMBOL in some other place. Of course, their powers of inference and patience might be very strong, in which case they'd make sure to define those pointers, but that seems like a lot to ask of users. > Wouldn't it be really easy to check for this situation, i.e. augment > the inclusion guards by some "if included before, but > PY_ARRAY_UNIQUE_SYMBOL/NO_IMPORT settings are different than the last time, > fail and tell the user about it"? > > At least that would give a compile error at an earlier point in time. Yes, that might be easy to do, and it's probably a good idea, but it's not an argument against normalizing (to abuse a term) the headers where possible. All the complication revolves around the API function pointers; as a user of numpy, I find it a bit frustrating that I have to concern myself with those complications when what I *really* want has nothing to do with those functions. Austin From charlesr.harris at gmail.com Thu May 6 10:23:15 2010 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu, 6 May 2010 08:23:15 -0600 Subject: [Numpy-discussion] PY_ARRAY_UNIQUE_SYMBOL is too far reaching? In-Reply-To: References: <4BE21BB6.4080805@silveregg.co.jp> <201005061021.25162.meine@informatik.uni-hamburg.de> Message-ID: On Thu, May 6, 2010 at 8:21 AM, Charles R Harris wrote: > > > On Thu, May 6, 2010 at 3:08 AM, Austin Bingham wrote: > >> >> they'd likely crash. >> > >> > Really? >> >> I base that on the assumption that they'd not know to call >> import_array() in that translation unit. This seems like a reasonable >> assumption because, by defining the macros as such, they are strongly >> implying that they expect the API functions to be imported for their >> definition of PY_ARRAY_UNIQUE_SYMBOL in some other place. Of course, >> their powers of inference and patience might be very strong, in which >> case they'd make sure to define those pointers, but that seems like a >> lot to ask of users. >> >> > Wouldn't it be really easy to check for this situation, i.e. augment >> > the inclusion guards by some "if included before, but >> > PY_ARRAY_UNIQUE_SYMBOL/NO_IMPORT settings are different than the last >> time, >> > fail and tell the user about it"? >> > >> > At least that would give a compile error at an earlier point in time. >> >> Yes, that might be easy to do, and it's probably a good idea, but it's >> not an argument against normalizing (to abuse a term) the headers >> where possible. All the complication revolves around the API function >> pointers; as a user of numpy, I find it a bit frustrating that I have >> to concern myself with those complications when what I *really* want >> has nothing to do with those functions. >> >> > Welcome to open source and the joys of backward compatibility ;) I like > your idea for breaking the header up, we really do need to try working on > the header situation and I think your suggestion could be helpful without > breaking current usage. > > Go ahead and open a ticket and provide a patch. Mark it as needs review. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Thu May 6 10:21:16 2010 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu, 6 May 2010 08:21:16 -0600 Subject: [Numpy-discussion] PY_ARRAY_UNIQUE_SYMBOL is too far reaching? 
In-Reply-To: References: <4BE21BB6.4080805@silveregg.co.jp> <201005061021.25162.meine@informatik.uni-hamburg.de> Message-ID: On Thu, May 6, 2010 at 3:08 AM, Austin Bingham wrote: > >> they'd likely crash. > > > > Really? > > I base that on the assumption that they'd not know to call > import_array() in that translation unit. This seems like a reasonable > assumption because, by defining the macros as such, they are strongly > implying that they expect the API functions to be imported for their > definition of PY_ARRAY_UNIQUE_SYMBOL in some other place. Of course, > their powers of inference and patience might be very strong, in which > case they'd make sure to define those pointers, but that seems like a > lot to ask of users. > > > Wouldn't it be really easy to check for this situation, i.e. augment > > the inclusion guards by some "if included before, but > > PY_ARRAY_UNIQUE_SYMBOL/NO_IMPORT settings are different than the last > time, > > fail and tell the user about it"? > > > > At least that would give a compile error at an earlier point in time. > > Yes, that might be easy to do, and it's probably a good idea, but it's > not an argument against normalizing (to abuse a term) the headers > where possible. All the complication revolves around the API function > pointers; as a user of numpy, I find it a bit frustrating that I have > to concern myself with those complications when what I *really* want > has nothing to do with those functions. > > Welcome to open source and the joys of backward compatibility ;) I like your idea for breaking the header up, we really do need to try working on the header situation and I think your suggestion could be helpful without breaking current usage. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From faltet at pytables.org Thu May 6 11:07:34 2010 From: faltet at pytables.org (Francesc Alted) Date: Thu, 6 May 2010 17:07:34 +0200 Subject: [Numpy-discussion] Decision tree-like algorithm on numpy arrays In-Reply-To: <4BE266B9.4050804@smhi.se> References: <4BE266B9.4050804@smhi.se> Message-ID: <201005061707.34118.faltet@pytables.org> Hi Martin, A Thursday 06 May 2010 08:50:33 Martin Raspaud escrigu?: > Hi all, > > I have an old c-extension I want to remove from my code to the benefit of > numpy, but it looks kind of tricky to me. > > Here is the thing: > I have a number of arrays of the same shape. > On these arrays, I run a sequence of tests, leading to a kind of decision > tree. In the end, based on these tests, I get a number of result arrays > where, based on the tests, each element gets a value. > > The way to do this in an efficient way with numpy is quite unclear to me. > My first thought would be: > > result_array1 = np.where(some_test_on(array1), > np.where(some_test_on(array2), > 1, > 2), > np.where(some_test_on(array3, array4), > np.where(some_test_on(array5), > 3, > 4), > 4)) > > result_array2 = np.where(some_test_on(array1), > np.where(some_test_on(array2), > True, > True), > np.where(some_test_on(array3, array4), > np.where(some_test_on(array5), > True, > False), > True)) > > etc... but that means running the same tests several times, which is not > acceptable if the tests are lengthy. The problem with performance, rather than being running the same tests several times, I'd say that it is more how NumPy deals with temporaries (i.e. it is a memory access problem). You may want to try numexpr in order to speed-up this sort of computations. 
For example, the next code: #------------------------------------------------------------------------ import numpy as np import numexpr as ne # if you don't have numexpr installed, but PyTables, try this instead #from tables import numexpr as ne from time import time N = 1e7 array1 = np.random.random(N) array2 = np.random.random(N) array3 = np.random.random(N) array4 = np.random.random(N) array5 = np.random.random(N) t0 = time() result_array1 = np.where(array1 > 0.5, np.where(array2 < 0.5, 1, 2), np.where(((array3 >.2) & (array4 < .1)), np.where(array5 >= .1, 3, 4), 4)) t = round(time() - t0, 3) print "result_array1:", result_array1, t t0 = time() result_array2 = ne.evaluate("""where(array1 > 0.5, where(array2 < 0.5, 1, 2), where(((array3 >.2) & (array4 < .1)), where(array5 >= .1, 3, 4), 4))""") t = round(time() - t0, 3) print "result_array2:", result_array2, t assert np.allclose(result_array1, result_array2) #------------------------------------------------------------------------ and the output for my machine: result_array1: [4 2 4 ..., 1 3 4] 1.819 result_array2: [4 2 4 ..., 1 3 4] 0.308 which is a 6x speed-up. I suppose this should be pretty close of what you can get with C. -- Francesc Alted From ralf.gommers at googlemail.com Thu May 6 11:44:44 2010 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Thu, 6 May 2010 23:44:44 +0800 Subject: [Numpy-discussion] Brooken Toolchain (Mac 10.6) In-Reply-To: References: Message-ID: On Thu, May 6, 2010 at 11:21 AM, Patrick Marsh wrote: > I apologize for not following up on this thread right away. I was sick for > a couple of days and then there were several major severe weather outbreaks > that required me to spend more time at work. With this said, I've resumed > trying to build a DMG with Python2.5. > > The "export" suggestion below worked with the broken toolchain, but now I > get a more issues. I get the following error, "Cannot compiler 'Python.h'. > Perhaps you need to install python-dev|python-devel," but the problem is > these files do exist! Looking through the build log it appears that it > fails on importing and various other standard libraries. > > I'm left to guess that this is an issue with gcc-4.0 picking up the wrong > headers? I tried to re-install gcc-4.0, however the gcc-4.0.mpkg won't > install because gcc-4.2 is found on the disk, so I don't know how to test > this theory. I've uploaded the output from the build attempt (with the > Traceback appended to the bottom). > > Short of reverting back to a previous version of OSX, I'm not sure what to > try next. I'm open to any suggestion(s) at this point. > > > Build log: http://www.patricktmarsh.com/tmp/build.out > Looks like an incomplete install of the 10.4 SDK again. What do you get for $ locate limits.h under /Developer/SDKs/MacOSX10.4u.sdk/ ? 
On my system it's the following: /Developer/SDKs/MacOSX10.4u.sdk/System/Library/Frameworks/Kernel.framework/Versions/A/Headers/i386/_limits.h /Developer/SDKs/MacOSX10.4u.sdk/System/Library/Frameworks/Kernel.framework/Versions/A/Headers/i386/limits.h /Developer/SDKs/MacOSX10.4u.sdk/System/Library/Frameworks/Kernel.framework/Versions/A/Headers/machine/_limits.h /Developer/SDKs/MacOSX10.4u.sdk/System/Library/Frameworks/Kernel.framework/Versions/A/Headers/machine/limits.h /Developer/SDKs/MacOSX10.4u.sdk/System/Library/Frameworks/Kernel.framework/Versions/A/Headers/ppc/_limits.h /Developer/SDKs/MacOSX10.4u.sdk/System/Library/Frameworks/Kernel.framework/Versions/A/Headers/ppc/limits.h /Developer/SDKs/MacOSX10.4u.sdk/System/Library/Frameworks/Kernel.framework/Versions/A/Headers/sys/syslimits.h /Developer/SDKs/MacOSX10.4u.sdk/usr/include/gcc/darwin/3.3/machine/limits.h /Developer/SDKs/MacOSX10.4u.sdk/usr/include/i386/_limits.h /Developer/SDKs/MacOSX10.4u.sdk/usr/include/i386/limits.h /Developer/SDKs/MacOSX10.4u.sdk/usr/include/limits.h /Developer/SDKs/MacOSX10.4u.sdk/usr/include/machine/_limits.h /Developer/SDKs/MacOSX10.4u.sdk/usr/include/machine/limits.h /Developer/SDKs/MacOSX10.4u.sdk/usr/include/ppc/_limits.h /Developer/SDKs/MacOSX10.4u.sdk/usr/include/ppc/limits.h /Developer/SDKs/MacOSX10.4u.sdk/usr/include/sys/syslimits.h /Developer/SDKs/MacOSX10.4u.sdk/usr/lib/gcc/i686-apple-darwin10/4.0.1/include/limits.h /Developer/SDKs/MacOSX10.4u.sdk/usr/lib/gcc/i686-apple-darwin10/4.0.1/include/syslimits.h /Developer/SDKs/MacOSX10.4u.sdk/usr/lib/gcc/powerpc-apple-darwin10/4.0.1/include/limits.h /Developer/SDKs/MacOSX10.4u.sdk/usr/lib/gcc/powerpc-apple-darwin10/4.0.1/include/syslimits.h Cheers, Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From tjhnson at gmail.com Thu May 6 13:25:38 2010 From: tjhnson at gmail.com (T J) Date: Thu, 6 May 2010 10:25:38 -0700 Subject: [Numpy-discussion] Remove duplicate columns Message-ID: Hi, Is there a way to sort the columns in an array? I need to sort it so that I can easily go through and keep only the unique columns. ndarray.sort(axis=1) doesn't do what I want as it destroys the relative ordering between the various columns. For example, I would like: [[2,1,3], [3,5,1], [0,3,1]] to go to: [[1,2,3], [5,3,1], [3,0,1]] (swap the first and second columns). So I want to treat the columns as objects and sort them. I can do this if I convert to a python list, but I was hoping to avoid doing that because I ultimately need to do element-wise bitwise operations. Thanks! From kwgoodman at gmail.com Thu May 6 13:34:19 2010 From: kwgoodman at gmail.com (Keith Goodman) Date: Thu, 6 May 2010 10:34:19 -0700 Subject: [Numpy-discussion] Remove duplicate columns In-Reply-To: References: Message-ID: On Thu, May 6, 2010 at 10:25 AM, T J wrote: > Hi, > > Is there a way to sort the columns in an array? ?I need to sort it so > that I can easily go through and keep only the unique columns. > ndarray.sort(axis=1) doesn't do what I want as it destroys the > relative ordering between the various columns. For example, I would > like: > > [[2,1,3], > ?[3,5,1], > ?[0,3,1]] > > to go to: > > [[1,2,3], > ?[5,3,1], > ?[3,0,1]] > > (swap the first and second columns). ?So I want to treat the columns > as objects and sort them. ?I can do this if I convert to a python > list, but I was hoping to avoid doing that because I ultimately need > to do element-wise bitwise operations. 
Assuming you want to sort columns by the values in the first row: >> x array([[2, 1, 3], [3, 5, 1], [0, 3, 1]]) >> idx = x[0,:].argsort() >> x[:,idx] array([[1, 2, 3], [5, 3, 1], [3, 0, 1]]) From josef.pktd at gmail.com Thu May 6 13:36:42 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Thu, 6 May 2010 13:36:42 -0400 Subject: [Numpy-discussion] Remove duplicate columns In-Reply-To: References: Message-ID: On Thu, May 6, 2010 at 1:25 PM, T J wrote: > Hi, > > Is there a way to sort the columns in an array? ?I need to sort it so > that I can easily go through and keep only the unique columns. > ndarray.sort(axis=1) doesn't do what I want as it destroys the > relative ordering between the various columns. For example, I would > like: > > [[2,1,3], > ?[3,5,1], > ?[0,3,1]] > > to go to: > > [[1,2,3], > ?[5,3,1], > ?[3,0,1]] > > (swap the first and second columns). ?So I want to treat the columns > as objects and sort them. ?I can do this if I convert to a python > list, but I was hoping to avoid doing that because I ultimately need > to do element-wise bitwise operations. there is a thread last august on unique rows which might be useful, and a thread in Dec 2008 for sorting rows something like np.unique1d(c.view([('',c.dtype)]*c.shape[1])).view(c.dtype).reshape(-1,c.shape[1]) maybe it's np.unique with numpy 1.4. Josef > > Thanks! > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From amenity at enthought.com Thu May 6 14:53:12 2010 From: amenity at enthought.com (Amenity Applewhite) Date: Thu, 6 May 2010 13:53:12 -0500 Subject: [Numpy-discussion] SciPy 2010: Bioinformatic & Parallel/cloud talks announced...& register now! References: Message-ID: <48351A11-7A2D-4A12-B9EE-9483046DA10A@enthought.com> Hello! Things are moving quickly in preparation for SciPy 2010: Last week we announced the General Conference schedule (http://conference.scipy.org/scipy2010/schedule.html ), Tuesday we announced our student sponsorship recipients (http://conference.scipy.org/scipy2010/student.html ) and now we're ready to tell you give you a look at the talks we have lined up for our Bioinformatics and Parallel Processing /Cloud Computing tracks. ===Parallel Processing & Cloud Computing track=== We really appreciate Brian and Ken's work organizing the papers for this specialized track. And of course, thanks to everyone who submitted a paper. There has been a great deal of interest in this set of talks ? and word on the street is that Brian may even have a HPC tutorial up his sleeve... * StarCluster - NumPy/SciPy Computing in the Cloud- Justin Riley * pomsets: workflow management for your cloud- Michael J Pan * Getting Down with Big Data Jared Flatow, Anita Lillie, Ville Tuulos * StarFlow: A Cloud-Enables Python Workflow Engine for Scientific Analysis Pipelines Elaine Angelino, Dan Yamins, Margo Seltzer * A Programmatic Interface for Particle Plasma Simulation in Python, and Early Backend Results with PyCUDA Min Ragan-Kelley * Parallel Computing with IPython: an Application to Air Pollution Modeling B.E. Granger, J.G. Hemann * Astronomy App in the Cloud using Google Geo APIs and Python App Engine Shawn Shen ===Bioinformatics track=== Once again, we are indebted to Glen Otero, from Dell, for putting together the Bioinformatics track. He received some fantastic papers and we're really looking forward to these presentations: * Protein Folding with Python on Supercomputers Jan H. 
Meinke * Can Python Save Next-Generation Sequencing? * The Use of Galaxy for the Research and the Teaching of Genomics Roy Weckiewicz, Jim Hu, and Rodolfo Aramayo ===Early registration ends next Monday=== That's right: Only a few days left before rates increase! Think of all the BBQ and breakfast tacos you can buy with that $50-$100 you'll save by registering early. If that doesn't convince you, consider: -Cheap flights to Austin- Buy your tickets now for some very nice prices: $275 from Chicago, $330 from San Francisco, $380 from New York City, $810 from London...(prices from Kayak.com) -Convenient & affordable hotel- We got an fantastic deal for on-site accommodations at the AT&T Conference Center. Pay only $89/night for single occupancy or $105/ night for double occupancy. It will be great to have everyone staying in the same spot. Once you register, you'll get a code to book your hotel reservation. The discounted rate will be applied automatically. https://conference.scipy.org/scipy2010/accommodation.html No car necessary to get to the conference... and see Austin! An airport bus (http://capmetro.org/riding/current_schedules/maps/rt100_sb.pdf ) runs straight to and from the AT&T center, so you won't have to rent a car at all. Plus, the UT campus area is in walking distance to a number of great restaurants and activities. For any longer trips you'd like to make Austin has a great public bus system. Not to mention all of the mind-blowing things you'll learn and outstanding people you'll meet and catch up with. So what are you waiting for? Register: https://conference.scipy.org/scipy2010/registration.html Best, The SciPy 2010 Team @SciPy2010 on Twitter From tjhnson at gmail.com Thu May 6 16:37:25 2010 From: tjhnson at gmail.com (T J) Date: Thu, 6 May 2010 13:37:25 -0700 Subject: [Numpy-discussion] Remove duplicate columns In-Reply-To: References: Message-ID: On Thu, May 6, 2010 at 10:34 AM, Keith Goodman wrote: > On Thu, May 6, 2010 at 10:25 AM, T J wrote: >> Hi, >> >> Is there a way to sort the columns in an array? ?I need to sort it so >> that I can easily go through and keep only the unique columns. >> ndarray.sort(axis=1) doesn't do what I want as it destroys the >> relative ordering between the various columns. For example, I would >> like: >> >> [[2,1,3], >> ?[3,5,1], >> ?[0,3,1]] >> >> to go to: >> >> [[1,2,3], >> ?[5,3,1], >> ?[3,0,1]] >> >> (swap the first and second columns). ?So I want to treat the columns >> as objects and sort them. ?I can do this if I convert to a python >> list, but I was hoping to avoid doing that because I ultimately need >> to do element-wise bitwise operations. > > Assuming you want to sort columns by the values in the first row: > Not quite. I want the columns treated as objects...not as the first element in the column. A better example: >>> x array([[3, 2, 2, 2, 2], [2, 2, 0, 2, 2], [0, 1, 1, 0, 1], [5, 5, 3, 0, 5]]) >>> desired array([[2, 2, 2, 2, 3], [0, 2, 2, 2, 2], [1, 0, 1, 1, 0], [3, 0, 5, 5, 5]]) >>> what_is_really_desired array([0,1,2,3]) # signifying unique columns From aisaac at american.edu Thu May 6 16:40:17 2010 From: aisaac at american.edu (Alan G Isaac) Date: Thu, 06 May 2010 16:40:17 -0400 Subject: [Numpy-discussion] trouble with bool_ Message-ID: <4BE32931.7070802@american.edu> What information exactly is `isnan` supposed to communicate? Put another way, given that it raises NotImplemented for unknown types, and that bool(NotImplemented) is True, is there a reason by it cannot return a Python bool (which seems more useful)? 
Thanks, Alan Isaac > >> np.isnan(np.nan) True > >> np.isnan(np.nan) is True False > >> type(np.isnan(np.nan)) > >> np.isnan('') NotImplemented > >> bool(_) True > >> if np.isnan(''): print "Uh oh" ... Uh oh From tjhnson at gmail.com Thu May 6 16:45:06 2010 From: tjhnson at gmail.com (T J) Date: Thu, 6 May 2010 13:45:06 -0700 Subject: [Numpy-discussion] Remove duplicate columns In-Reply-To: References: Message-ID: On Thu, May 6, 2010 at 10:36 AM, wrote: > > there is a thread last august on unique rows which might be useful, > and a thread in Dec 2008 for sorting rows > > something like > > np.unique1d(c.view([('',c.dtype)]*c.shape[1])).view(c.dtype).reshape(-1,c.shape[1]) > > maybe it's np.unique with numpy 1.4. > The thread is useful: http://www.mail-archive.com/numpy-discussion at scipy.org/msg19830.html I'll have to see if it is quicker for me to just do: >>> y = x.transpose().tolist() >>> y.sort() >>> x = np.array(y).transpose() From josef.pktd at gmail.com Thu May 6 17:42:12 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Thu, 6 May 2010 17:42:12 -0400 Subject: [Numpy-discussion] Remove duplicate columns In-Reply-To: References: Message-ID: On Thu, May 6, 2010 at 4:45 PM, T J wrote: > On Thu, May 6, 2010 at 10:36 AM, ? wrote: >> >> there is a thread last august on unique rows which might be useful, >> and a thread in Dec 2008 for sorting rows >> >> something like >> >> np.unique1d(c.view([('',c.dtype)]*c.shape[1])).view(c.dtype).reshape(-1,c.shape[1]) >> >> maybe it's np.unique with numpy 1.4. >> > > The thread is useful: > > ?http://www.mail-archive.com/numpy-discussion at scipy.org/msg19830.html > > I'll have to see if it is quicker for me to just do: > >>>> y = x.transpose().tolist() >>>> y.sort() >>>> x = np.array(y).transpose() for sure it's easier to read. the difference might be temporary array creation compared to using numpy.sort on a view. Josef > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From liukis at usc.edu Thu May 6 19:34:14 2010 From: liukis at usc.edu (Maria Liukis) Date: Thu, 06 May 2010 16:34:14 -0700 Subject: [Numpy-discussion] saving object array to ascii file Message-ID: <42D29088-C5A3-44E2-8C10-9185765A10C9@usc.edu> Hello Everybody, Sorry if it's a trivial question. I'm trying to find out if there is a way to save object array to ascii file. numpy.savetxt() in NumPy V1.3.0 doesn't seem to work: >>> import numpy as np >>> obj_arr = np.zeros((2,), dtype=np.object) >>> obj_arr[0] = np.array([[1,2,3], [4,5,6], [7,8,9]]) >>> obj_arr[1] = np.array([[10,11], [12,13]]) >>> obj_arr array([[[1 2 3] [4 5 6] [7 8 9]], [[10 11] [12 13]]], dtype=object) >>> np.savetxt('obj_array.dat', obj_arr) Traceback (most recent call last): File "", line 1, in File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/lib/io.py", line 636, in savetxt fh.write(format % tuple(row) + '\n') TypeError: float argument required >>> scipy.io.savemat() supports Matlab format of the object array, but I could not find any documentation on ASCII file format for object arrays. 
Thanks in advance, Masha -------------------- liukis at usc.edu From charlesr.harris at gmail.com Thu May 6 23:45:25 2010 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu, 6 May 2010 21:45:25 -0600 Subject: [Numpy-discussion] Remove duplicate columns In-Reply-To: References: Message-ID: On Thu, May 6, 2010 at 11:25 AM, T J wrote: > Hi, > > Is there a way to sort the columns in an array? I need to sort it so > that I can easily go through and keep only the unique columns. > ndarray.sort(axis=1) doesn't do what I want as it destroys the > relative ordering between the various columns. For example, I would > like: > > [[2,1,3], > [3,5,1], > [0,3,1]] > > to go to: > > [[1,2,3], > [5,3,1], > [3,0,1]] > > (swap the first and second columns). So I want to treat the columns > as objects and sort them. I can do this if I convert to a python > list, but I was hoping to avoid doing that because I ultimately need > to do element-wise bitwise operations. > > To get the order illustrated: In [9]: a = array([[2,1,3],[3,5,1],[0,3,1]]) In [10]: i = lexsort([a[::-1][i] for i in range(3)]) In [11]: a[:,i] Out[11]: array([[1, 2, 3], [5, 3, 1], [3, 0, 1]]) But if you just want them sorted, it is easier to do In [12]: i = lexsort([a[i] for i in range(3)]) In [13]: a[:,i] Out[13]: array([[2, 3, 1], [3, 1, 5], [0, 1, 3]]) or just In [18]: a[:,lexsort(a)] Out[18]: array([[2, 3, 1], [3, 1, 5], [0, 1, 3]]) For the bigger array In [21]: a Out[21]: array([[3, 2, 2, 2, 2], [2, 2, 0, 2, 2], [0, 1, 1, 0, 1], [5, 5, 3, 0, 5]]) In [22]: a[:, lexsort(a)] Out[22]: array([[2, 2, 3, 2, 2], [2, 0, 2, 2, 2], [0, 1, 0, 1, 1], [0, 3, 5, 5, 5]]) Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From patrickmarshwx at gmail.com Fri May 7 01:47:40 2010 From: patrickmarshwx at gmail.com (Patrick Marsh) Date: Fri, 7 May 2010 00:47:40 -0500 Subject: [Numpy-discussion] Brooken Toolchain (Mac 10.6) In-Reply-To: References: Message-ID: I don't have that directory on my machine, so I'm trying to track down my OSX 10.4 install disc to see if I can reinstall it. Thanks for the suggestion; I'll let you know tomorrow what happens. Patrick On Thu, May 6, 2010 at 10:44 AM, Ralf Gommers wrote: > > > On Thu, May 6, 2010 at 11:21 AM, Patrick Marsh wrote: > >> I apologize for not following up on this thread right away. I was sick >> for a couple of days and then there were several major severe weather >> outbreaks that required me to spend more time at work. With this said, I've >> resumed trying to build a DMG with Python2.5. >> >> The "export" suggestion below worked with the broken toolchain, but now I >> get a more issues. I get the following error, "Cannot compiler 'Python.h'. >> Perhaps you need to install python-dev|python-devel," but the problem is >> these files do exist! Looking through the build log it appears that it >> fails on importing and various other standard libraries. >> >> I'm left to guess that this is an issue with gcc-4.0 picking up the wrong >> headers? I tried to re-install gcc-4.0, however the gcc-4.0.mpkg won't >> install because gcc-4.2 is found on the disk, so I don't know how to test >> this theory. I've uploaded the output from the build attempt (with the >> Traceback appended to the bottom). >> >> Short of reverting back to a previous version of OSX, I'm not sure what to >> try next. I'm open to any suggestion(s) at this point. >> >> >> Build log: http://www.patricktmarsh.com/tmp/build.out >> > > Looks like an incomplete install of the 10.4 SDK again. 
What do you get for > > $ locate limits.h > under /Developer/SDKs/MacOSX10.4u.sdk/ ? On my system it's the following: > > > /Developer/SDKs/MacOSX10.4u.sdk/System/Library/Frameworks/Kernel.framework/Versions/A/Headers/i386/_limits.h > > /Developer/SDKs/MacOSX10.4u.sdk/System/Library/Frameworks/Kernel.framework/Versions/A/Headers/i386/limits.h > > /Developer/SDKs/MacOSX10.4u.sdk/System/Library/Frameworks/Kernel.framework/Versions/A/Headers/machine/_limits.h > > /Developer/SDKs/MacOSX10.4u.sdk/System/Library/Frameworks/Kernel.framework/Versions/A/Headers/machine/limits.h > > /Developer/SDKs/MacOSX10.4u.sdk/System/Library/Frameworks/Kernel.framework/Versions/A/Headers/ppc/_limits.h > > /Developer/SDKs/MacOSX10.4u.sdk/System/Library/Frameworks/Kernel.framework/Versions/A/Headers/ppc/limits.h > > /Developer/SDKs/MacOSX10.4u.sdk/System/Library/Frameworks/Kernel.framework/Versions/A/Headers/sys/syslimits.h > /Developer/SDKs/MacOSX10.4u.sdk/usr/include/gcc/darwin/3.3/machine/limits.h > /Developer/SDKs/MacOSX10.4u.sdk/usr/include/i386/_limits.h > /Developer/SDKs/MacOSX10.4u.sdk/usr/include/i386/limits.h > /Developer/SDKs/MacOSX10.4u.sdk/usr/include/limits.h > /Developer/SDKs/MacOSX10.4u.sdk/usr/include/machine/_limits.h > /Developer/SDKs/MacOSX10.4u.sdk/usr/include/machine/limits.h > /Developer/SDKs/MacOSX10.4u.sdk/usr/include/ppc/_limits.h > /Developer/SDKs/MacOSX10.4u.sdk/usr/include/ppc/limits.h > /Developer/SDKs/MacOSX10.4u.sdk/usr/include/sys/syslimits.h > > /Developer/SDKs/MacOSX10.4u.sdk/usr/lib/gcc/i686-apple-darwin10/4.0.1/include/limits.h > > /Developer/SDKs/MacOSX10.4u.sdk/usr/lib/gcc/i686-apple-darwin10/4.0.1/include/syslimits.h > > /Developer/SDKs/MacOSX10.4u.sdk/usr/lib/gcc/powerpc-apple-darwin10/4.0.1/include/limits.h > /Developer/SDKs/MacOSX10.4u.sdk/usr/lib/gcc/powerpc-apple-darwin10/4.0.1/include/syslimits.h > > > > Cheers, > Ralf > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -- Patrick Marsh Ph.D. Student / NSSL Liaison to the HWT School of Meteorology / University of Oklahoma Cooperative Institute for Mesoscale Meteorological Studies National Severe Storms Laboratory http://www.patricktmarsh.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at googlemail.com Fri May 7 02:11:08 2010 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Fri, 7 May 2010 14:11:08 +0800 Subject: [Numpy-discussion] Brooken Toolchain (Mac 10.6) In-Reply-To: References: Message-ID: On Fri, May 7, 2010 at 1:47 PM, Patrick Marsh wrote: > I don't have that directory on my machine, so I'm trying to track down my > OSX 10.4 install disc to see if I can reinstall it. Thanks for the > suggestion; I'll let you know tomorrow what happens. > You don't need the 10.4 disc, the 10.4 SDK is an optional install with XCode on your Snow Leopard DVD. Cheers, Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From martin.raspaud at smhi.se Fri May 7 02:18:44 2010 From: martin.raspaud at smhi.se (Martin Raspaud) Date: Fri, 07 May 2010 08:18:44 +0200 Subject: [Numpy-discussion] Decision tree-like algorithm on numpy arrays In-Reply-To: <201005061707.34118.faltet@pytables.org> References: <4BE266B9.4050804@smhi.se> <201005061707.34118.faltet@pytables.org> Message-ID: <4BE3B0C4.1060205@smhi.se> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Francesc Alted skrev: > Hi Martin, [...] 
> > and the output for my machine: > > result_array1: [4 2 4 ..., 1 3 4] 1.819 > result_array2: [4 2 4 ..., 1 3 4] 0.308 > > which is a 6x speed-up. I suppose this should be pretty close of what you can > get with C. > Hi Francesc, Thanks a lot for the idea ! This looks nice, I wasn't aware of numexpr. I guess it's no problem to run "evaluate" on user defined functions ? Thanks, Martin -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.9 (GNU/Linux) Comment: Using GnuPG with Fedora - http://enigmail.mozdev.org/ iQEcBAEBAgAGBQJL47DEAAoJEBdvyODiyJI4j64IAMRdLOzrO8vGlxMvJcMaUR8l /NgeOSTCSfwuFMJRq9F8CWalQ8r/rTAU0XLIVTMPnKneHiclavFQCi3bhevYnInK Iyi1JBd+S058g3TlfrX4aJ2qmuddlBnMeUXcmRuOgmOggm8wtoMF66bNc+LtiDgZ Z/D4FlaYu1Vv9/NMD9n6jIm1Cynx/oznblW93RsWtA/gWHNtwRcERyLWTFoai8VW 0YfjbqRpi2f5wM1tc3IgkN4MRNy+MUHpMQ1s9DAi8syunno8R+hwKi0XKm6dMRxW 2l9r6NPlHv7AtgznO1HRVW2u6gg+9eX80fxjfFaYQyD0Fyb5VwFA6Cd/Xj7830o= =MiIS -----END PGP SIGNATURE----- -------------- next part -------------- A non-text attachment was scrubbed... Name: martin_raspaud.vcf Type: text/x-vcard Size: 260 bytes Desc: not available URL: From faltet at pytables.org Fri May 7 03:05:44 2010 From: faltet at pytables.org (Francesc Alted) Date: Fri, 7 May 2010 09:05:44 +0200 Subject: [Numpy-discussion] Decision tree-like algorithm on numpy arrays In-Reply-To: <4BE3B0C4.1060205@smhi.se> References: <4BE266B9.4050804@smhi.se> <201005061707.34118.faltet@pytables.org> <4BE3B0C4.1060205@smhi.se> Message-ID: <201005070905.44361.faltet@pytables.org> A Friday 07 May 2010 08:18:44 Martin Raspaud escrigu?: > Francesc Alted skrev: > > Hi Martin, > > [...] > > > and the output for my machine: > > > > result_array1: [4 2 4 ..., 1 3 4] 1.819 > > result_array2: [4 2 4 ..., 1 3 4] 0.308 > > > > which is a 6x speed-up. I suppose this should be pretty close of what > > you can get with C. > > Hi Francesc, > Thanks a lot for the idea ! > This looks nice, I wasn't aware of numexpr. > > I guess it's no problem to run "evaluate" on user defined functions ? No problem. You only have to express your functions in terms of numexpr expressions. 
Look at this simple example: #------------------------------------------------------------------ import numpy as np import numexpr as ne N = 1e7 array1 = np.random.random(N) array2 = np.random.random(N) array3 = np.random.random(N) # An user-defined function def some_test_on(arr, value): return arr > value result_array1 = np.where(some_test_on(array1, 0.6), np.where(some_test_on(array2, 0.5), 1, 2), np.where(some_test_on(array3, 0.3), 3, 4)) print "result_array1:", result_array1 # The same user-defined function than above, # but return a numexpr expression instead def ne_some_test_on(arr, value): return "(" + arr + " > %s" % value + ")" expr1 = "where("+ne_some_test_on("array2", 0.5) + ", 1, 2)" expr2 = "where("+ne_some_test_on("array3", 0.3) + ", 3, 4)" expr = "where("+ne_some_test_on("array1", 0.6) + ", %s, %s)" % (expr1, expr2) result_array2 = ne.evaluate(expr) print "result_array2:", result_array2 assert np.allclose(result_array1, result_array2) #------------------------------------------------------------------ and the output: result_array1: [2 3 4 ..., 3 1 3] result_array2: [2 3 4 ..., 3 1 3] -- Francesc Alted From friedrichromstedt at gmail.com Fri May 7 14:36:14 2010 From: friedrichromstedt at gmail.com (Friedrich Romstedt) Date: Fri, 7 May 2010 20:36:14 +0200 Subject: [Numpy-discussion] saving object array to ascii file In-Reply-To: <42D29088-C5A3-44E2-8C10-9185765A10C9@usc.edu> References: <42D29088-C5A3-44E2-8C10-9185765A10C9@usc.edu> Message-ID: 2010/5/7 Maria Liukis : > Sorry if it's a trivial question. I'm trying to find out if there is a way to save object array to ascii file. numpy.savetxt() in NumPy V1.3.0 doesn't seem to work: Maybe according to http://docs.scipy.org/doc/numpy/reference/generated/numpy.savetxt.html#numpy.savetxt the *fmt* argument can help here: >>> numpy.savetxt('obj_array.dat', obj_arr, fmt = '%s') This uses Python str() to convert the elements into Python strings. You may also use '%r' which uses repr() instead. I think this is the only way because noone except the objects themselves know how to be printed. Friedrich From charlesr.harris at gmail.com Sat May 8 20:44:59 2010 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sat, 8 May 2010 18:44:59 -0600 Subject: [Numpy-discussion] Datetime overflow error, attn Stefan. Message-ID: Hi Stefan, The windows buildbot throws the error ===================================================================== FAIL: test_creation_overflow (test_datetime.TestDateTime) ---------------------------------------------------------------------- Traceback (most recent call last): File "C:\buildbot\numpy\b11\numpy-install25\Lib\site-packages\numpy\core\tests\test_datetime.py", line 68, in test_creation_overflow err_msg='Datetime conversion error for unit %s' % unit) File "..\numpy-install25\Lib\site-packages\numpy\testing\utils.py", line 313, in assert_equal AssertionError: Items are not equal: Datetime conversion error for unit ms ACTUAL: 567052800 DESIRED: 322689600000 Because window's longs are always 32 bit, I think this is a good indication that somewhere a long is being used instead of an intp. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From gokhansever at gmail.com Sat May 8 20:52:09 2010 From: gokhansever at gmail.com (=?UTF-8?Q?G=C3=B6khan_Sever?=) Date: Sat, 8 May 2010 19:52:09 -0500 Subject: [Numpy-discussion] Another masked array question Message-ID: Hello, Consider my masked arrays: I[28]: type basic.data['Air_Temp'] -----> type(basic.data['Air_Temp']) O[28]: numpy.ma.core.MaskedArray I[29]: basic.data['Air_Temp'] O[29]: masked_array(data = [-- -- -- ..., -- -- --], mask = [ True True True ..., True True True], fill_value = 999999.9999) I[17]: basic.data['Air_Temp'].data = np.ones(len(basic.data['Air_Temp']))*30 --------------------------------------------------------------------------- AttributeError Traceback (most recent call last) ----> 1 2 3 4 5 AttributeError: can't set attribute Why this assignment fails? I want to set each element in the original basic.data['Air_Temp'].data to another value. (Because the main instrument was forgotten to turn on for that day, and I am using a secondary measurement data for Air Temperature for my another calculation. However it fails. Although single assignment works: I[13]: basic.data['Air_Temp'].data[0] = 30 Shouldn't this be working like the regular NumPy arrays do? Thanks. -- G?khan -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Sat May 8 21:06:01 2010 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sat, 8 May 2010 19:06:01 -0600 Subject: [Numpy-discussion] Datetime overflow error, attn Stefan. In-Reply-To: References: Message-ID: On Sat, May 8, 2010 at 6:44 PM, Charles R Harris wrote: > Hi Stefan, > > The windows buildbot throws the error > > > ===================================================================== > FAIL: test_creation_overflow (test_datetime.TestDateTime) > ---------------------------------------------------------------------- > Traceback (most recent call last): > File "C:\buildbot\numpy\b11\numpy-install25\Lib\site-packages\numpy\core\tests\test_datetime.py", line 68, in test_creation_overflow > err_msg='Datetime conversion error for unit %s' % unit) > File "..\numpy-install25\Lib\site-packages\numpy\testing\utils.py", line 313, in assert_equal > AssertionError: > Items are not equal: Datetime conversion error for unit ms > ACTUAL: 567052800 > DESIRED: 322689600000 > > > Because window's longs are always 32 bit, I think this is a good indication that somewhere a long is being used instead of an intp. > > Probably all references to long should be changed, it just isn't portable. And do you have any idea what is supposed to happen here: if (year >= 0 || -1/4 == -1) I know this isn't your code, but you have been looking at it, so now you are responsible ;) Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From rmay31 at gmail.com Sat May 8 22:16:18 2010 From: rmay31 at gmail.com (Ryan May) Date: Sat, 8 May 2010 21:16:18 -0500 Subject: [Numpy-discussion] Another masked array question In-Reply-To: References: Message-ID: On Sat, May 8, 2010 at 7:52 PM, G?khan Sever wrote: > Hello, > > Consider my masked arrays: > > I[28]: type basic.data['Air_Temp'] > -----> type(basic.data['Air_Temp']) > O[28]: numpy.ma.core.MaskedArray > > I[29]: basic.data['Air_Temp'] > O[29]: > masked_array(data = [-- -- -- ..., -- -- --], > ???????????? mask = [ True? True? True ...,? True? True? True], > ?????? 
fill_value = 999999.9999) > > > I[17]: basic.data['Air_Temp'].data = np.ones(len(basic.data['Air_Temp']))*30 > --------------------------------------------------------------------------- > AttributeError??????????????????????????? Traceback (most recent call last) > > ----> 1 > ????? 2 > ????? 3 > ????? 4 > ????? 5 > > AttributeError: can't set attribute > > Why this assignment fails? I want to set each element in the original > basic.data['Air_Temp'].data to another value. (Because the main instrument > was forgotten to turn on for that day, and I am using a secondary > measurement data for Air Temperature for my another calculation. However it > fails. Although single assignment works: > > I[13]: basic.data['Air_Temp'].data[0] = 30 > > Shouldn't this be working like the regular NumPy arrays do? Based on the traceback, I'd say it's because you're trying to replace the object pointed to by the .data attribute. Instead, try to just change the bits contained in .data: basic.data['Air_Temp'].data[:] = np.ones(len(basic.data['Air_Temp']))*30 Ryan -- Ryan May Graduate Research Assistant School of Meteorology University of Oklahoma From gokhansever at gmail.com Sat May 8 22:27:39 2010 From: gokhansever at gmail.com (=?UTF-8?Q?G=C3=B6khan_Sever?=) Date: Sat, 8 May 2010 21:27:39 -0500 Subject: [Numpy-discussion] Another masked array question In-Reply-To: References: Message-ID: On Sat, May 8, 2010 at 9:16 PM, Ryan May wrote: > On Sat, May 8, 2010 at 7:52 PM, G?khan Sever > wrote: > > Hello, > > > > Consider my masked arrays: > > > > I[28]: type basic.data['Air_Temp'] > > -----> type(basic.data['Air_Temp']) > > O[28]: numpy.ma.core.MaskedArray > > > > I[29]: basic.data['Air_Temp'] > > O[29]: > > masked_array(data = [-- -- -- ..., -- -- --], > > mask = [ True True True ..., True True True], > > fill_value = 999999.9999) > > > > > > I[17]: basic.data['Air_Temp'].data = > np.ones(len(basic.data['Air_Temp']))*30 > > > --------------------------------------------------------------------------- > > AttributeError Traceback (most recent call > last) > > > > ----> 1 > > 2 > > 3 > > 4 > > 5 > > > > AttributeError: can't set attribute > > > > Why this assignment fails? I want to set each element in the original > > basic.data['Air_Temp'].data to another value. (Because the main > instrument > > was forgotten to turn on for that day, and I am using a secondary > > measurement data for Air Temperature for my another calculation. However > it > > fails. Although single assignment works: > > > > I[13]: basic.data['Air_Temp'].data[0] = 30 > > > > Shouldn't this be working like the regular NumPy arrays do? > > Based on the traceback, I'd say it's because you're trying to replace > the object pointed to by the .data attribute. Instead, try to just > change the bits contained in .data: > > basic.data['Air_Temp'].data[:] = np.ones(len(basic.data['Air_Temp']))*30 > > Ryan > > -- > Ryan May > Graduate Research Assistant > School of Meteorology > University of Oklahoma > Thanks for the pointer Ryan. Now it works as it is supposed to be. -- G?khan -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From efiring at hawaii.edu Sat May 8 22:29:01 2010 From: efiring at hawaii.edu (Eric Firing) Date: Sat, 08 May 2010 16:29:01 -1000 Subject: [Numpy-discussion] Another masked array question In-Reply-To: References: Message-ID: <4BE61DED.2070707@hawaii.edu> On 05/08/2010 04:16 PM, Ryan May wrote: > On Sat, May 8, 2010 at 7:52 PM, G?khan Sever wrote: >> Hello, >> >> Consider my masked arrays: >> >> I[28]: type basic.data['Air_Temp'] >> -----> type(basic.data['Air_Temp']) >> O[28]: numpy.ma.core.MaskedArray >> >> I[29]: basic.data['Air_Temp'] >> O[29]: >> masked_array(data = [-- -- -- ..., -- -- --], >> mask = [ True True True ..., True True True], >> fill_value = 999999.9999) >> >> >> I[17]: basic.data['Air_Temp'].data = np.ones(len(basic.data['Air_Temp']))*30 >> --------------------------------------------------------------------------- >> AttributeError Traceback (most recent call last) >> >> ----> 1 >> 2 >> 3 >> 4 >> 5 >> >> AttributeError: can't set attribute >> >> Why this assignment fails? I want to set each element in the original >> basic.data['Air_Temp'].data to another value. (Because the main instrument >> was forgotten to turn on for that day, and I am using a secondary >> measurement data for Air Temperature for my another calculation. However it >> fails. Although single assignment works: >> >> I[13]: basic.data['Air_Temp'].data[0] = 30 >> >> Shouldn't this be working like the regular NumPy arrays do? > > Based on the traceback, I'd say it's because you're trying to replace > the object pointed to by the .data attribute. Instead, try to just > change the bits contained in .data: > > basic.data['Air_Temp'].data[:] = np.ones(len(basic.data['Air_Temp']))*30 Also, you since you are setting all elements to a single value, you don't need to generate an array on the right-hand side. And, you don't need to manipulate ".data" directly--I think it is best to avoid doing so. 
Consider: In [1]:x = np.ma.array([1,2,3], mask=[True, True, True], dtype=float) In [2]:x Out[2]: masked_array(data = [-- -- --], mask = [ True True True], fill_value = 1e+20) In [3]:x[:] = 30 In [4]:x Out[4]: masked_array(data = [30.0 30.0 30.0], mask = [False False False], fill_value = 1e+20) In [5]:x[:] = np.ma.masked In [6]:x Out[6]: masked_array(data = [-- -- --], mask = [ True True True], fill_value = 1e+20) In [7]:x.data Out[7]:array([ 30., 30., 30.]) Eric > > Ryan > From gokhansever at gmail.com Sat May 8 22:51:37 2010 From: gokhansever at gmail.com (=?UTF-8?Q?G=C3=B6khan_Sever?=) Date: Sat, 8 May 2010 21:51:37 -0500 Subject: [Numpy-discussion] Another masked array question In-Reply-To: <4BE61DED.2070707@hawaii.edu> References: <4BE61DED.2070707@hawaii.edu> Message-ID: On Sat, May 8, 2010 at 9:29 PM, Eric Firing wrote: > On 05/08/2010 04:16 PM, Ryan May wrote: > > On Sat, May 8, 2010 at 7:52 PM, G?khan Sever > wrote: > >> Hello, > >> > >> Consider my masked arrays: > >> > >> I[28]: type basic.data['Air_Temp'] > >> -----> type(basic.data['Air_Temp']) > >> O[28]: numpy.ma.core.MaskedArray > >> > >> I[29]: basic.data['Air_Temp'] > >> O[29]: > >> masked_array(data = [-- -- -- ..., -- -- --], > >> mask = [ True True True ..., True True True], > >> fill_value = 999999.9999) > >> > >> > >> I[17]: basic.data['Air_Temp'].data = > np.ones(len(basic.data['Air_Temp']))*30 > >> > --------------------------------------------------------------------------- > >> AttributeError Traceback (most recent call > last) > >> > >> ----> 1 > >> 2 > >> 3 > >> 4 > >> 5 > >> > >> AttributeError: can't set attribute > >> > >> Why this assignment fails? I want to set each element in the original > >> basic.data['Air_Temp'].data to another value. (Because the main > instrument > >> was forgotten to turn on for that day, and I am using a secondary > >> measurement data for Air Temperature for my another calculation. However > it > >> fails. Although single assignment works: > >> > >> I[13]: basic.data['Air_Temp'].data[0] = 30 > >> > >> Shouldn't this be working like the regular NumPy arrays do? > > > > Based on the traceback, I'd say it's because you're trying to replace > > the object pointed to by the .data attribute. Instead, try to just > > change the bits contained in .data: > > > > basic.data['Air_Temp'].data[:] = np.ones(len(basic.data['Air_Temp']))*30 > > Also, you since you are setting all elements to a single value, you > don't need to generate an array on the right-hand side. And, you don't > need to manipulate ".data" directly--I think it is best to avoid doing > so. 
Consider: > > In [1]:x = np.ma.array([1,2,3], mask=[True, True, True], dtype=float) > > In [2]:x > Out[2]: > masked_array(data = [-- -- --], > mask = [ True True True], > fill_value = 1e+20) > > > In [3]:x[:] = 30 > > In [4]:x > Out[4]: > masked_array(data = [30.0 30.0 30.0], > mask = [False False False], > fill_value = 1e+20) > > > In [5]:x[:] = np.ma.masked > > In [6]:x > Out[6]: > masked_array(data = [-- -- --], > mask = [ True True True], > fill_value = 1e+20) > > > In [7]:x.data > Out[7]:array([ 30., 30., 30.]) > > > Eric > > > > > Ryan > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > Good to see this :) I[45]: x = np.ma.array([1,2,3], mask=[True, True, True], dtype=float) I[46]: x O[46]: masked_array(data = [-- -- --], mask = [ True True True], fill_value = 1e+20) I[47]: x.data[:] = 25 I[48]: x O[48]: masked_array(data = [-- -- --], mask = [ True True True], fill_value = 1e+20) I[49]: x[:] = 25 I[50]: x O[50]: masked_array(data = [25.0 25.0 25.0], mask = [False False False], fill_value = 1e+20) I was also updating mask values after updating data attribute. Now setting the masked array itself to a number automatically flips the masks for me which is very useful. I check if a valid temperature exists, otherwise assign my calculation to another missing value. -- G?khan -------------- next part -------------- An HTML attachment was scrubbed... URL: From tjhnson at gmail.com Sun May 9 01:01:22 2010 From: tjhnson at gmail.com (T J) Date: Sat, 8 May 2010 22:01:22 -0700 Subject: [Numpy-discussion] pareto docstring Message-ID: The docstring for np.pareto says: This is a simplified version of the Generalized Pareto distribution (available in SciPy), with the scale set to one and the location set to zero. Most authors default the location to one. and also: The probability density for the Pareto distribution is .. math:: p(x) = \frac{am^a}{x^{a+1}} where :math:`a` is the shape and :math:`m` the location These two statements seem to be in contradiction. I think what was meant is that m is the scale, rather than the location. For if m were equal to zero, as the first portion of the docstring states, then the entire pdf would be zero for all shapes a>0. ---- Also, I'm not quite understanding how the stated pdf is actually the same as the pdf for the generalized pareto with the scale=1 and location=0. By the wikipedia definition of the generalized Pareto distribution, if we take \sigma=1 (scale equal to one) and \mu=0 (location equal to zero), then we get: (1 + a x)^(-1/a - 1) which is normalized over $x \in (0, \infty)$. If we compare this to the distribution stated in the docstring (with m=1) a x^{-a-1} we see that it is normalized over $x \in (1, \infty)$. And indeed, the distribution requires x > scale = 1. If we integrate the generalized Pareto (with scale=1, location=0) over $x \in (1, \infty)$ then we have to re-normalize. So should the docstring say: This is a simplified version of the Generalized Pareto distribution (available in Scipy), with the scale set to one, the location set to zero, and the distribution re-normalized over the range (1, \infty). Most authors default the location to one. 
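One quick way to check which density numpy.random.pareto actually follows is to compare the empirical CDF of its output against the two candidates discussed above. A minimal sketch, assuming only NumPy and using the standard closed forms 1 - x**(-a) for the classical Pareto (m=1) and 1 - (1+x)**(-a) for the shifted form:

import numpy as np

a = 3.0
s = np.sort(np.random.pareto(a, 100000))
ecdf = np.arange(1, s.size + 1) / float(s.size)

# candidate CDFs evaluated at the sorted sample points
classical = 1.0 - np.clip(s, 1.0, np.inf)**(-a)   # CDF of a/x**(a+1), support (1, inf)
shifted   = 1.0 - (1.0 + s)**(-a)                 # CDF of a/(1+x)**(a+1), support (0, inf)

print "max |ecdf - classical|:", np.abs(ecdf - classical).max()
print "max |ecdf - shifted|:  ", np.abs(ecdf - shifted).max()

The second difference comes out near zero while the first does not, which is consistent with the samples following a/(1+x)**(a+1) rather than a/x**(a+1).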
From josef.pktd at gmail.com Sun May 9 07:49:00 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Sun, 9 May 2010 07:49:00 -0400 Subject: [Numpy-discussion] pareto docstring In-Reply-To: References: Message-ID: On Sun, May 9, 2010 at 1:01 AM, T J wrote: > The docstring for np.pareto says: > > ? ?This is a simplified version of the Generalized Pareto distribution > ? ?(available in SciPy), with the scale set to one and the location set to > ? ?zero. Most authors default the location to one. > > and also: > > ? ?The probability density for the Pareto distribution is > > ? ?.. math:: p(x) = \frac{am^a}{x^{a+1}} > > ? ?where :math:`a` is the shape and :math:`m` the location > > These two statements seem to be in contradiction. ?I think what was > meant is that m is the scale, rather than the location. ?For if m were > equal to zero, as the first portion of the docstring states, then the > entire pdf would be zero for all shapes a>0. > > > ---- > > Also, ?I'm not quite understanding how the stated pdf is actually the > same as the pdf for the generalized pareto with the scale=1 and > location=0. ?By the wikipedia definition of the generalized Pareto > distribution, if we take \sigma=1 (scale equal to one) ?and \mu=0 > (location equal to zero), then we get: > > (1 + a x)^(-1/a - 1) > > which is normalized over $x \in (0, \infty)$. ?If we compare this to > the distribution stated in the docstring (with m=1) > > a x^{-a-1} > > we see that it is normalized over $x \in (1, \infty)$. ?And indeed, > the distribution requires x > scale = 1. > > If we integrate the generalized Pareto (with scale=1, location=0) over > $x \in (1, \infty)$ then we have to re-normalize. ?So should the > docstring say: > > ? This is a simplified version of the Generalized Pareto distribution > ? (available in Scipy), with the scale set to one, the location set to zero, > ? and the distribution re-normalized over the range (1, \infty). Most > ? authors default the location to one. I think this is the same point, I was trying to make last year. Instead of renormalizing, my conclusion was the following, (copied from the mailinglist August last year) """ my conclusion: --------------------- What numpy.random.pareto actually produces, are random numbers from a pareto distribution with lower bound m=1, but location parameter loc=-1, that shifts the distribution to the left. To actually get useful random numbers (that are correct in the usual usage http://en.wikipedia.org/wiki/Pareto_distribution), we need to add 1 to them. stats.distributions doesn't use mtrand.pareto rvs_pareto = 1 + numpy.random.pareto(a, size) """ I still have to work though the math of your argument, but maybe we can come to an agreement how the docstrings (or the function) should be changed, and what numpy.random.pareto really means. Josef (grateful, that there are another set of eyes on this) > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From pav at iki.fi Sun May 9 08:34:00 2010 From: pav at iki.fi (Pauli Virtanen) Date: Sun, 9 May 2010 12:34:00 +0000 (UTC) Subject: [Numpy-discussion] Datetime overflow error, attn Stefan. References: Message-ID: Sat, 08 May 2010 18:44:59 -0600, Charles R Harris wrote: [clip] > Because window's longs are always 32 bit, I think this is a good > indication that somewhere a long is being used instead of an intp. The test itself used 32-bit integers. Fix'd. 
Pauli From pgmdevlist at gmail.com Fri May 7 16:28:55 2010 From: pgmdevlist at gmail.com (Pierre GM) Date: Fri, 7 May 2010 15:28:55 -0500 Subject: [Numpy-discussion] Question about numpy.ma masking In-Reply-To: References: Message-ID: On May 4, 2010, at 8:38 PM, G?khan Sever wrote: > Hello, > > I have the following arrays read as masked array. > > I[10]: basic.data['Air_Temp'].mask > O[10]: array([ True, False, False, ..., False, False, False], dtype=bool) > > [12]: basic.data['Press_Alt'].mask > O[12]: False > > I[13]: len basic.data['Air_Temp'] > -----> len(basic.data['Air_Temp']) > O[13]: 1758 > > > The first item data['Air_Temp'] has only the first element masked and this result with mask attribute being created an equal data length bool array. On the other hand data['Press_Alt'] has no elements to mask yielding a 'False' scalar. Is this a documented behavior or intentionally designed this way? This is the only case out of 20 that breaks my code as following: :) > > IndexError Traceback (most recent call last) > > 130 for k in range(len(shorter)): > 131 if (serialh.data['dccnTempSF'][k] != 0) \ > --> 132 and (basic.data['Air_Temp'].mask[k+diff] == False): > 133 dccnConAmb[k] = serialc.data['dccnConc'][k] * \ > 134 physical.data['STATIC_PR'][k+diff] * \ > > IndexError: invalid index to scalar variable. > > since mask is a scalar in this case, nothing to loop terminating with an IndexError. Gokhan, Sorry for not getting back sooner, web connectivity was limited on my side. I must admit I can't really see what you're tring to do here, but I'll throw some random comments: * If you're using structured MaskedArrays, it's a really bad idea to call one of the fields "data", as it may interact in a non-obvious way with the actual "data" property (the one that outputs a view of the array as a pure ndarray). * if you need to test whether an array has some masked elements, try something like >>> myarray.mask is nomask If True, no item is masked, the mask is a boolean and you can move on. If False, then the mask is a ndarray w/ as many elements as the array and you can index it. From d.l.goldsmith at gmail.com Sun May 9 12:57:19 2010 From: d.l.goldsmith at gmail.com (David Goldsmith) Date: Sun, 9 May 2010 09:57:19 -0700 Subject: [Numpy-discussion] pareto docstring In-Reply-To: References: Message-ID: On Sun, May 9, 2010 at 4:49 AM, wrote: > On Sun, May 9, 2010 at 1:01 AM, T J wrote: > > The docstring for np.pareto says: > > > > This is a simplified version of the Generalized Pareto distribution > > (available in SciPy), with the scale set to one and the location set > to > > zero. Most authors default the location to one. > > > > and also: > > > > The probability density for the Pareto distribution is > > > > .. math:: p(x) = \frac{am^a}{x^{a+1}} > > > > where :math:`a` is the shape and :math:`m` the location > > > > These two statements seem to be in contradiction. I think what was > > meant is that m is the scale, rather than the location. For if m were > > equal to zero, as the first portion of the docstring states, then the > > entire pdf would be zero for all shapes a>0. > > > > ---- > > > > Also, I'm not quite understanding how the stated pdf is actually the > > same as the pdf for the generalized pareto with the scale=1 and > > location=0. By the wikipedia definition of the generalized Pareto > > distribution, if we take \sigma=1 (scale equal to one) and \mu=0 > > (location equal to zero), then we get: > > > > (1 + a x)^(-1/a - 1) > > > > which is normalized over $x \in (0, \infty)$. 
If we compare this to > > the distribution stated in the docstring (with m=1) > > > > a x^{-a-1} > > > > we see that it is normalized over $x \in (1, \infty)$. And indeed, > > the distribution requires x > scale = 1. > > > > If we integrate the generalized Pareto (with scale=1, location=0) over > > $x \in (1, \infty)$ then we have to re-normalize. So should the > > docstring say: > > > > This is a simplified version of the Generalized Pareto distribution > > (available in Scipy), with the scale set to one, the location set to > zero, > > and the distribution re-normalized over the range (1, \infty). Most > > authors default the location to one. > > I think this is the same point, I was trying to make last year. > > Instead of renormalizing, my conclusion was the following, > (copied from the mailinglist August last year) > > """ > my conclusion: > --------------------- > What numpy.random.pareto actually produces, are random numbers from a > pareto distribution with lower bound m=1, but location parameter > loc=-1, that shifts the distribution to the left. > > To actually get useful random numbers (that are correct in the usual > usage http://en.wikipedia.org/wiki/Pareto_distribution), we need to > add 1 to them. > stats.distributions doesn't use mtrand.pareto > is thr > rvs_pareto = 1 + numpy.random.pareto(a, size) > > """ > > I still have to work though the math of your argument, but maybe we > can come to an agreement how the docstrings (or the function) should > be changed, and what numpy.random.pareto really means. > > Josef > (grateful, that there are another set of eyes on this) > > Maybe this is obvious, but what I would suggest is: 0) Determine precisely what the existing code is doing 1) Decide if that is the desired behavior 2) If it is, make the docstring conform 3) If it isn't, file a "bug" ticket 4) If disposed to do so, fix the code 4) Ensure that the docstring documents the desired behavior (The last two have the same number because they may be done in either order, or even concurrently.) Again, sorry if the above is obvious... DG -------------- next part -------------- An HTML attachment was scrubbed... URL: From gokhansever at gmail.com Sun May 9 15:01:27 2010 From: gokhansever at gmail.com (=?UTF-8?Q?G=C3=B6khan_Sever?=) Date: Sun, 9 May 2010 14:01:27 -0500 Subject: [Numpy-discussion] Question about numpy.ma masking In-Reply-To: References: Message-ID: On Fri, May 7, 2010 at 3:28 PM, Pierre GM wrote: > On May 4, 2010, at 8:38 PM, G?khan Sever wrote: > > Hello, > > > > I have the following arrays read as masked array. > > > > I[10]: basic.data['Air_Temp'].mask > > O[10]: array([ True, False, False, ..., False, False, False], dtype=bool) > > > > [12]: basic.data['Press_Alt'].mask > > O[12]: False > > > > I[13]: len basic.data['Air_Temp'] > > -----> len(basic.data['Air_Temp']) > > O[13]: 1758 > > > > > > The first item data['Air_Temp'] has only the first element masked and > this result with mask attribute being created an equal data length bool > array. On the other hand data['Press_Alt'] has no elements to mask yielding > a 'False' scalar. Is this a documented behavior or intentionally designed > this way? 
This is the only case out of 20 that breaks my code as following: > :) > > > > IndexError Traceback (most recent call > last) > > > > 130 for k in range(len(shorter)): > > 131 if (serialh.data['dccnTempSF'][k] != 0) \ > > --> 132 and (basic.data['Air_Temp'].mask[k+diff] == False): > > 133 dccnConAmb[k] = serialc.data['dccnConc'][k] * \ > > 134 physical.data['STATIC_PR'][k+diff] * \ > > > > IndexError: invalid index to scalar variable. > > > > since mask is a scalar in this case, nothing to loop terminating with an > IndexError. > > > Gokhan, > Sorry for not getting back sooner, web connectivity was limited on my side. > I must admit I can't really see what you're tring to do here, but I'll > throw some random comments: > * If you're using structured MaskedArrays, it's a really bad idea to call > one of the fields "data", as it may interact in a non-obvious way with the > actual "data" property (the one that outputs a view of the array as a pure > ndarray). > Hello Pierre, basic.data is a dictionary containing all masked array items. When I read the original data into scripts, my main constructor-reader class automatically converts data to masked arrays. basic.data['Air_Temp'] is a masked array itself, little confusing for sure it also has 'data' attribute. In the above example I check one condition looping in mask value. When mask attribute isn't an bool-array (when there is no missing value in data) the condition fails asserting an IndexError. I was wondering why it doesn't yield a bool-array instead of giving me a scalar False. -- G?khan -------------- next part -------------- An HTML attachment was scrubbed... URL: From efiring at hawaii.edu Sun May 9 15:42:15 2010 From: efiring at hawaii.edu (Eric Firing) Date: Sun, 09 May 2010 09:42:15 -1000 Subject: [Numpy-discussion] Question about numpy.ma masking In-Reply-To: References: Message-ID: <4BE71017.6010000@hawaii.edu> On 05/09/2010 09:01 AM, G?khan Sever wrote: > > > On Fri, May 7, 2010 at 3:28 PM, Pierre GM > wrote: > > On May 4, 2010, at 8:38 PM, G?khan Sever wrote: > > Hello, > > > > I have the following arrays read as masked array. > > > > I[10]: basic.data['Air_Temp'].mask > > O[10]: array([ True, False, False, ..., False, False, False], > dtype=bool) > > > > [12]: basic.data['Press_Alt'].mask > > O[12]: False > > > > I[13]: len basic.data['Air_Temp'] > > -----> len(basic.data['Air_Temp']) > > O[13]: 1758 > > > > > > The first item data['Air_Temp'] has only the first element masked > and this result with mask attribute being created an equal data > length bool array. On the other hand data['Press_Alt'] has no > elements to mask yielding a 'False' scalar. Is this a documented > behavior or intentionally designed this way? This is the only case > out of 20 that breaks my code as following: :) > > > > IndexError Traceback (most recent > call last) > > > > 130 for k in range(len(shorter)): > > 131 if (serialh.data['dccnTempSF'][k] != 0) \ > > --> 132 and (basic.data['Air_Temp'].mask[k+diff] == False): > > 133 dccnConAmb[k] = serialc.data['dccnConc'][k] * \ > > 134 > physical.data['STATIC_PR'][k+diff] * \ > > > > IndexError: invalid index to scalar variable. > > > > since mask is a scalar in this case, nothing to loop terminating > with an IndexError. > > > Gokhan, > Sorry for not getting back sooner, web connectivity was limited on > my side. 
> I must admit I can't really see what you're tring to do here, but > I'll throw some random comments: > * If you're using structured MaskedArrays, it's a really bad idea to > call one of the fields "data", as it may interact in a non-obvious > way with the actual "data" property (the one that outputs a view of > the array as a pure ndarray). > > > Hello Pierre, > > basic.data is a dictionary containing all masked array items. When I > read the original data into scripts, my main constructor-reader class > automatically converts data to masked arrays. basic.data['Air_Temp'] is > a masked array itself, little confusing for sure it also has 'data' > attribute. > > In the above example I check one condition looping in mask value. When > mask attribute isn't an bool-array (when there is no missing value in > data) the condition fails asserting an IndexError. I was wondering why > it doesn't yield a bool-array instead of giving me a scalar False. The mask attribute can be a full array, or it can be a scalar to indicate that nothing is masked. This is an optimization in masked arrays; it adds complexity, but it can save space and/or processing time. You can always access a full mask array by using np.ma.getmaskarray(). Or you can ensure the internal mask is an array, not a scalar, by using the shrink=False kwarg when making the masked array with np.ma.array(). Offhand, I suspect your loop can be eliminated by vectorization. Something like this: ns = len(shorter) slice0 = slice(ns) slice1 = slice(diff, diff+ns) cond1 = serialh.data['dccnTempSF'][slice0] != 0 cond2 = np.ma.getmaskarray(basic.data['Air_Temp'][slice1]) == False cond = cond1 & cond2 dccnConAmb[slice0][cond] = (serialc.data['dccnConc'][slice0][cond] * physical.data['STATIC_PR'][slice1][cond]) Eric From neilcrighton at gmail.com Mon May 10 11:23:12 2010 From: neilcrighton at gmail.com (Neil Crighton) Date: Mon, 10 May 2010 15:23:12 +0000 (UTC) Subject: [Numpy-discussion] chararray stripping trailing whitespace a bug? Message-ID: I've been working with pyfits, which uses numpy chararrays. I've discovered the hard way that chararrays silently remove trailing whitespace: >>> a = np.array(['a ']) >>> b = a.view(np.chararray) >>> a[0] 'a ' >>> b[0] 'a' Note the string values stored in memory are unchanged. This behaviour caused a bug in a program I've been writing, and seems like a bad idea in general. Is it intentional? Neil From warren.weckesser at enthought.com Mon May 10 11:29:55 2010 From: warren.weckesser at enthought.com (Warren Weckesser) Date: Mon, 10 May 2010 10:29:55 -0500 Subject: [Numpy-discussion] chararray stripping trailing whitespace a bug? In-Reply-To: References: Message-ID: <4BE82673.4050501@enthought.com> From the chararray docstring: Versus a regular Numpy array of type `str` or `unicode`, this class adds the following functionality: 1) values automatically have whitespace removed from the end when indexed So I guess it is a feature, not a bug. :) Warren Neil Crighton wrote: > I've been working with pyfits, which uses numpy chararrays. I've discovered the > hard way that chararrays silently remove trailing whitespace: > > >>>> a = np.array(['a ']) >>>> b = a.view(np.chararray) >>>> a[0] >>>> > 'a ' > >>>> b[0] >>>> > 'a' > > Note the string values stored in memory are unchanged. This behaviour caused a > bug in a program I've been writing, and seems like a bad idea in general. Is it > intentional? 
> > Neil > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From chanley at stsci.edu Mon May 10 11:40:45 2010 From: chanley at stsci.edu (Christopher Hanley) Date: Mon, 10 May 2010 11:40:45 -0400 Subject: [Numpy-discussion] chararray stripping trailing whitespace a bug? In-Reply-To: References: Message-ID: On Mon, May 10, 2010 at 11:23 AM, Neil Crighton wrote: > I've been working with pyfits, which uses numpy chararrays. I've discovered the > hard way that chararrays silently remove trailing whitespace: > >>>> a = np.array(['a ']) >>>> b = a.view(np.chararray) >>>> a[0] > 'a ' >>>> b[0] > 'a' > > Note the string values stored in memory are unchanged. This behaviour caused a > bug in a program I've been writing, and seems like a bad idea in general. Is it > intentional? > > Neil This is an intentional "feature", not a bug. Chris -- Christopher Hanley Senior Systems Software Engineer Space Telescope Science Institute 3700 San Martin Drive Baltimore MD, 21218 (410) 338-4338 From gokhansever at gmail.com Mon May 10 12:17:58 2010 From: gokhansever at gmail.com (=?UTF-8?Q?G=C3=B6khan_Sever?=) Date: Mon, 10 May 2010 11:17:58 -0500 Subject: [Numpy-discussion] Question about numpy.ma masking In-Reply-To: <4BE71017.6010000@hawaii.edu> References: <4BE71017.6010000@hawaii.edu> Message-ID: On Sun, May 9, 2010 at 2:42 PM, Eric Firing wrote: > > The mask attribute can be a full array, or it can be a scalar to > indicate that nothing is masked. This is an optimization in masked > arrays; it adds complexity, but it can save space and/or processing > time. You can always access a full mask array by using > np.ma.getmaskarray(). Or you can ensure the internal mask is an array, > not a scalar, by using the shrink=False kwarg when making the masked > array with np.ma.array(). > shrink=False fits perfect for my use-case. I was guessing that leaving the mask as scalar should something to do with optimization. Probably not many people around write loops and check conditions based on the mask content like I do :) I hope someone in SciPy10 will present a Numpy.MA talk or tutorial describing all the nitty details of the module usage. > > Offhand, I suspect your loop can be eliminated by vectorization. > Something like this: > > ns = len(shorter) > slice0 = slice(ns) > slice1 = slice(diff, diff+ns) > cond1 = serialh.data['dccnTempSF'][slice0] != 0 > cond2 = np.ma.getmaskarray(basic.data['Air_Temp'][slice1]) == False > cond = cond1 & cond2 > dccnConAmb[slice0][cond] = (serialc.data['dccnConc'][slice0][cond] * > physical.data['STATIC_PR'][slice1][cond]) > Bonus help :) My gmail has over 400 Python tagged e-mails collected over a year. I get responses here (in mailing lists general) most of the time faster than I get locally around my department. This (especially no-appointments feature) doubles triples my learning experience. Just a personal thanks to you and all who make these great mediums possible. Anyways back to the topic again. The snippet I share is about a year old from the times that I didn't know much about vectorization. Your version looks good to my eyes, but it is little harder to read in general. Also I don't know how would you debug this code. Sometimes I need to pause the execution of scripts and step-by-step move through the lines and see how values are changing in each iteration. 
Lastly, this dccnConAmb is my CCN concentration normalized at ambient pressure and temperature that I use to estimate C and k parameters from power-law relationship using scipy's curve_fit() in case someone is curious what I am after. > > Eric > > G?khan -------------- next part -------------- An HTML attachment was scrubbed... URL: From neilcrighton at gmail.com Mon May 10 13:20:18 2010 From: neilcrighton at gmail.com (Neil Crighton) Date: Mon, 10 May 2010 17:20:18 +0000 (UTC) Subject: [Numpy-discussion] chararray stripping trailing whitespace a bug? References: Message-ID: > > This is an intentional "feature", not a bug. > > Chris > Ah, ok, thanks. I missed the explanation in the doc string because I'm using version 1.3 and forgot to check the web docs. For the record, this was my bug: I read a fits binary table with pyfits. One of the table fields was a chararray containing a bunch of flags ('A','B','C','D'). I tried to use in1d() to identify all entries with flags of 'C' or 'D'. So >>> c = pyfits_table.chararray_column >>> mask = np.in1d(c, ['C', 'D']) It turns out the actual stored values in the chararray were 'A ', 'B ', 'C ' and 'D '. in1d() converts the chararray to an ndarray before performing the comparison, so none of the entries matches 'C' or 'D'. What is the best way to ensure this doesn't happen to other people? We could change the array set operations to special-case chararrays, but this seems like an ugly solution. Is it possible to change something in pyfits to avoid this? Neil From tjhnson at gmail.com Mon May 10 14:14:47 2010 From: tjhnson at gmail.com (T J) Date: Mon, 10 May 2010 11:14:47 -0700 Subject: [Numpy-discussion] pareto docstring In-Reply-To: References: Message-ID: On Sun, May 9, 2010 at 4:49 AM, wrote: > > I think this is the same point, I was trying to make last year. > > Instead of renormalizing, my conclusion was the following, > (copied from the mailinglist August last year) > > """ > my conclusion: > --------------------- > What numpy.random.pareto actually produces, are random numbers from a > pareto distribution with lower bound m=1, but location parameter > loc=-1, that shifts the distribution to the left. > > To actually get useful ?random numbers (that are correct in the usual > usage http://en.wikipedia.org/wiki/Pareto_distribution), we need to > add 1 to them. > stats.distributions doesn't use mtrand.pareto > > rvs_pareto = 1 + numpy.random.pareto(a, size) > > """ > > I still have to work though the math of your argument, but maybe we > can come to an agreement how the docstrings (or the function) should > be changed, and what numpy.random.pareto really means. > > Josef > (grateful, that there are another set of eyes on this) > > Yes, I think my "renormalizing" statement is incorrect as it is really just sampling from a different pdf altogether. See the following image: http://www.dumpt.com/img/viewer.php?file=q9tfk7ehxsw865vn067c.png It plots histograms of the various implementations against the pdfs. Summarizing: The NumPy implementation is based on (Devroye p. 262). The pdf listed there is: a / (1+x)^(a+1) This differs from the "standard" Pareto pdf: a / x^(a+1) It also differs from the pdf of the generalized Pareto distribution, with scale=1 and location=0: (1 + a x)^(-1/a - 1) And it also differs from the pdf of the generalized Pareto distribution with scale=1 and location=-1 or location=1. random.paretovariate and scipy.stats.pareto sample from the standard Pareto, and this is the desired behavior, IMO. 
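(That agreement is easy to spot-check numerically; a small sketch, assuming the stdlib random module and scipy.stats are available:

import random
import numpy as np
from scipy import stats

a = 3.0
draws = np.array([random.paretovariate(a) for _ in xrange(100000)])

print draws.min() >= 1.0                                    # support starts at 1, as for the classical Pareto
print abs(np.mean(draws < 2.0) - stats.pareto.cdf(2.0, a))  # ~0: fraction below 2 matches CDF 1 - x**(-a)

Both quantities behave as the classical m=1 Pareto predicts.)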
Its true that "1 + np.random.pareto" provides the fix, but I think we're better off changing the underlying implementation. Devroye has a more recent paper: http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.85.8760 which states the Pareto distribution in the standard way. So I think it is safe to make this change. Backwards compatibility might be the only argument for not making this change. So here is my proposal: 1) Remove every mention of the generalized Pareto distribution from the docstring. As far as I can see, the generalized Pareto distribution does not reduce to the "standard" Pareto at all. We can still mention scipy.stats.distributions.genpareto and scipy.stats.distributions.pareto. The former is something different and the latter will (now) be equivalent to the NumPy function. 2) Modify numpy/random/mtrand/distributions.c in the following way: double rk_pareto(rk_state *state, double a) { //return exp(rk_standard_exponential(state)/a) - 1; return 1.0 / rk_double(state)**(1.0 / a); } Does this sound good? From pgmdevlist at gmail.com Sun May 9 20:16:32 2010 From: pgmdevlist at gmail.com (Pierre GM) Date: Sun, 9 May 2010 19:16:32 -0500 Subject: [Numpy-discussion] Another masked array question In-Reply-To: References: <4BE61DED.2070707@hawaii.edu> Message-ID: <3D0C4CAB-63DB-4668-B22E-A27D4B245B1E@gmail.com> On May 8, 2010, at 9:51 PM, G?khan Sever wrote: > > > > On Sat, May 8, 2010 at 9:29 PM, Eric Firing wrote: > On 05/08/2010 04:16 PM, Ryan May wrote: > > On Sat, May 8, 2010 at 7:52 PM, G?khan Sever wrote: > >> > >> AttributeError: can't set attribute > >> > >> Why this assignment fails? I want to set each element in the original > >> basic.data['Air_Temp'].data to another value. (Because the main instrument > >> was forgotten to turn on for that day, and I am using a secondary > >> measurement data for Air Temperature for my another calculation. However it > >> fails. Although single assignment works: > >> > >> I[13]: basic.data['Air_Temp'].data[0] = 30 > >> > >> Shouldn't this be working like the regular NumPy arrays do? > > > > Based on the traceback, I'd say it's because you're trying to replace > > the object pointed to by the .data attribute. Instead, try to just > > change the bits contained in .data: > > > > basic.data['Air_Temp'].data[:] = np.ones(len(basic.data['Air_Temp']))*30 > > Also, you since you are setting all elements to a single value, you > don't need to generate an array on the right-hand side. And, you don't > need to manipulate ".data" directly--I think it is best to avoid doing > so. Consider: Yep. The "data" attribute is in fact a read-only property that retuns a view of the masked array as a standard ndarray. If you need to set individual values, just do so on the masked array. If you need to mask a value, use the syntax >>> yourarray[yourindex] = np.ma.masked From mdroe at stsci.edu Mon May 10 16:51:38 2010 From: mdroe at stsci.edu (Michael Droettboom) Date: Mon, 10 May 2010 16:51:38 -0400 Subject: [Numpy-discussion] chararray stripping trailing whitespace a bug? In-Reply-To: References: Message-ID: <4BE871DA.4020801@stsci.edu> Also from the docstring: """ .. note:: The `chararray` class exists for backwards compatibility with Numarray, it is not recommended for new development. Starting from numpy 1.4, if one needs arrays of strings, it is recommended to use arrays of `dtype` `object_`, `string_` or `unicode_`, and use the free functions in the `numpy.char` module for fast vectorized string operations. """ Neil Crighton wrote: > Ah, ok, thanks. 
I missed the explanation in the doc string because I'm using > version 1.3 and forgot to check the web docs. > > For the record, this was my bug: I read a fits binary table with pyfits. One of > the table fields was a chararray containing a bunch of flags ('A','B','C','D'). > I tried to use in1d() to identify all entries with flags of 'C' or 'D'. So > > >>>> c = pyfits_table.chararray_column >>>> mask = np.in1d(c, ['C', 'D']) >>>> > > It turns out the actual stored values in the chararray were 'A ', 'B ', 'C ' > and 'D '. in1d() converts the chararray to an ndarray before performing the > comparison, so none of the entries matches 'C' or 'D'. > This inconsistency is fixed in Numpy 1.4 (which included a major overhaul of chararrays). in1d will perform the auto whitespace-stripping on chararrays, but not on regular ndarrays of strings. > What is the best way to ensure this doesn't happen to other people? We could > change the array set operations to special-case chararrays, but this seems like > an ugly solution. Is it possible to change something in pyfits to avoid this? > Pyfits continues to use chararray since not doing so would break existing code relying on this behavior. And there are many use cases where this behavior is desirable, particularly with fixed-length strings in tables. The best way to get around it from your code is to cast the chararray pyfits returns to a regular ndarray. The cast does not perform a copy, so should be very efficient: In [6]: from numpy import char In [7]: import numpy as np In [8]: c = char.array(['a ', 'b ']) In [9]: c Out[9]: chararray(['a', 'b'], dtype='|S2') In [10]: np.asarray(c) Out[11]: array(['a ', 'b '], dtype='|S2') I suggest casting between to either chararray or ndarray depending on whether you want the auto-whitespace-stripping behavior. Mike -- Michael Droettboom Science Software Branch Operations and Engineering Division Space Telescope Science Institute Operated by AURA for NASA From d.l.goldsmith at gmail.com Mon May 10 16:56:15 2010 From: d.l.goldsmith at gmail.com (David Goldsmith) Date: Mon, 10 May 2010 13:56:15 -0700 Subject: [Numpy-discussion] pareto docstring In-Reply-To: References: Message-ID: On Mon, May 10, 2010 at 11:14 AM, T J wrote: > On Sun, May 9, 2010 at 4:49 AM, wrote: > > > > I think this is the same point, I was trying to make last year. > > > > Instead of renormalizing, my conclusion was the following, > > (copied from the mailinglist August last year) > > > > """ > > my conclusion: > > --------------------- > > What numpy.random.pareto actually produces, are random numbers from a > > pareto distribution with lower bound m=1, but location parameter > > loc=-1, that shifts the distribution to the left. > > > > To actually get useful random numbers (that are correct in the usual > > usage http://en.wikipedia.org/wiki/Pareto_distribution), we need to > > add 1 to them. > > stats.distributions doesn't use mtrand.pareto > > > > rvs_pareto = 1 + numpy.random.pareto(a, size) > > > > """ > > > > I still have to work though the math of your argument, but maybe we > > can come to an agreement how the docstrings (or the function) should > > be changed, and what numpy.random.pareto really means. > > > > Josef > > (grateful, that there are another set of eyes on this) > > > > > > > Yes, I think my "renormalizing" statement is incorrect as it is really > just sampling from a different pdf altogether. 
See the following image: > > http://www.dumpt.com/img/viewer.php?file=q9tfk7ehxsw865vn067c.png > > It plots histograms of the various implementations against the pdfs. > Summarizing: > > The NumPy implementation is based on (Devroye p. 262). The pdf listed > there is: > > a / (1+x)^(a+1) > > This differs from the "standard" Pareto pdf: > > a / x^(a+1) > > It also differs from the pdf of the generalized Pareto distribution, > with scale=1 and location=0: > > (1 + a x)^(-1/a - 1) > > And it also differs from the pdf of the generalized Pareto > distribution with scale=1 and location=-1 or location=1. > > random.paretovariate and scipy.stats.pareto sample from the standard > Pareto, and this is the desired behavior, IMO. Its true that "1 + > np.random.pareto" provides the fix, but I think we're better off > changing the underlying implementation. Devroye has a more recent > paper: > > http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.85.8760 > > which states the Pareto distribution in the standard way. So I think > it is safe to make this change. Backwards compatibility might be the > only argument for not making this change. So here is my proposal: > > 1) Remove every mention of the generalized Pareto distribution from > the docstring. As far as I can see, the generalized Pareto > distribution does not reduce to the "standard" Pareto at all. We can > still mention scipy.stats.distributions.genpareto and > scipy.stats.distributions.pareto. The former is something different > and the latter will (now) be equivalent to the NumPy function. > > 2) Modify numpy/random/mtrand/distributions.c in the following way: > > double rk_pareto(rk_state *state, double a) > { > //return exp(rk_standard_exponential(state)/a) - 1; > return 1.0 / rk_double(state)**(1.0 / a); > } > > Does this sound good? > _______________________________________________ > Whatever the community decides, don't forget to please go through the formal procedure of submitting a "bug" ticket so all of this is recorded in the "right" way in the "right" place. Thanks! DG -- Mathematician: noun, someone who disavows certainty when their uncertainty set is non-empty, even if that set has measure zero. -------------- next part -------------- An HTML attachment was scrubbed... URL: From gberbeglia at gmail.com Mon May 10 17:42:42 2010 From: gberbeglia at gmail.com (gerardob) Date: Mon, 10 May 2010 14:42:42 -0700 (PDT) Subject: [Numpy-discussion] check for inequalities on a list Message-ID: <28517353.post@talk.nabble.com> I have three lists of floats of equal lenght: upper_bound, lower_bound and x. I would like to check whether lower_bound[i]<= x[i] <= upper_bound[i] for all i in range(len(x)) Which is the best way to do this? Thanks. -- View this message in context: http://old.nabble.com/check-for-inequalities-on-a-list-tp28517353p28517353.html Sent from the Numpy-discussion mailing list archive at Nabble.com. 
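One possible approach is to vectorize the test and reduce it to a single boolean; a minimal sketch with made-up values, assuming the three lists are first converted to arrays:

import numpy as np

lower_bound = np.array([0.0, 1.0, 2.0])
x           = np.array([0.5, 1.5, 2.5])
upper_bound = np.array([1.0, 2.0, 3.0])

# True only if every i satisfies both inequalities
print np.all((lower_bound <= x) & (x <= upper_bound))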
From aisaac at american.edu Mon May 10 17:47:17 2010 From: aisaac at american.edu (Alan G Isaac) Date: Mon, 10 May 2010 17:47:17 -0400 Subject: [Numpy-discussion] check for inequalities on a list In-Reply-To: <28517353.post@talk.nabble.com> References: <28517353.post@talk.nabble.com> Message-ID: <4BE87EE5.3090901@american.edu> On 5/10/2010 5:42 PM, gerardob wrote: > I would like to check whether lower_bound[i]<= x[i]<= upper_bound[i] for > all i in range(len(x)) >>> import numpy as np >>> l, m, u = np.arange(12).reshape((3,4)) >>> (l <= m) & (m <= u) array([ True, True, True, True], dtype=bool) >>> l[3]=9 >>> (l <= m) & (m <= u) array([ True, True, True, False], dtype=bool) hth, Alan Isaac From neilcrighton at gmail.com Mon May 10 18:14:31 2010 From: neilcrighton at gmail.com (Neil Crighton) Date: Mon, 10 May 2010 22:14:31 +0000 (UTC) Subject: [Numpy-discussion] chararray stripping trailing whitespace a bug? References: <4BE871DA.4020801@stsci.edu> Message-ID: > This inconsistency is fixed in Numpy 1.4 (which included a major > overhaul of chararrays). in1d will perform the auto > whitespace-stripping on chararrays, but not on regular ndarrays of strings. Great, thanks. > Pyfits continues to use chararray since not doing so would break > existing code relying on this behavior. And there are many use cases > where this behavior is desirable, particularly with fixed-length strings > in tables. > > The best way to get around it from your code is to cast the chararray > pyfits returns to a regular ndarray. My problem was I didn't know I needed to get around it :) But thanks for the suggestion, I'll use that in future when I need to switch between chararrays and ndarrays. Neil From pfeldman at verizon.net Mon May 10 18:53:56 2010 From: pfeldman at verizon.net (Dr. Phillip M. Feldman) Date: Mon, 10 May 2010 15:53:56 -0700 (PDT) Subject: [Numpy-discussion] efficient way to manage a set of floats? Message-ID: <28518014.post@talk.nabble.com> I have an application that involves managing sets of floats. I can use Python's built-in set type, but a data structure that is optimized for fixed-size objects that can be compared without hashing should be more efficient than a more general set construct. Is something like this available? -- View this message in context: http://old.nabble.com/efficient-way-to-manage-a-set-of-floats--tp28518014p28518014.html Sent from the Numpy-discussion mailing list archive at Nabble.com. From aarchiba at physics.mcgill.ca Mon May 10 19:43:08 2010 From: aarchiba at physics.mcgill.ca (Anne Archibald) Date: Mon, 10 May 2010 19:43:08 -0400 Subject: [Numpy-discussion] efficient way to manage a set of floats? In-Reply-To: <28518014.post@talk.nabble.com> References: <28518014.post@talk.nabble.com> Message-ID: On 10 May 2010 18:53, Dr. Phillip M. Feldman wrote: > > I have an application that involves managing sets of floats. ?I can use > Python's built-in set type, but a data structure that is optimized for > fixed-size objects that can be compared without hashing should be more > efficient than a more general set construct. ?Is something like this > available? You might not find this as useful as you think - on a 32-bit machine, the space overhead is roughly a 32-bit object pointer or two for each float, plus about twice the number of floats times 32-bit pointers for the table. And hashing might be worthwhile anyway - you could easily have a series of floats with related bit patterns you'd want to scatter all over hash space. 
Plus python's set object has seen a fair amount of performance tweaing. That said, there's support in numpy for many operations which use a sorted 1D array to represent a set of floats. There's searchsorted for lookup, plus IIRC union and intersection operators; I'm not sure about set difference. The big thing missing is updates, though if you can batch them, concatenate followed by sorting should be reasonable. Removal can be done with fancy indexing, though again batching is recommended. Maybe these should be regarded as analogous to python's frozensets. In terms of speed, the numpy functions are obviously not as asymptotically efficient as hash tables, though I suspect memory coherence is more of an issue than O(1) versus O(log(n)). The numpy functions allow vectorized lookups and vector operations on sets, which could be handy. Anne P.S. if you want sets with fuzzy queries, it occurs to me that scipy's kdtrees will actually do an okay job, and in compiled code. No updates there either, though. -A > View this message in context: http://old.nabble.com/efficient-way-to-manage-a-set-of-floats--tp28518014p28518014.html > Sent from the Numpy-discussion mailing list archive at Nabble.com. > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From pfeldman at verizon.net Mon May 10 21:56:17 2010 From: pfeldman at verizon.net (Dr. Phillip M. Feldman) Date: Mon, 10 May 2010 18:56:17 -0700 (PDT) Subject: [Numpy-discussion] efficient way to manage a set of floats? In-Reply-To: References: <28518014.post@talk.nabble.com> Message-ID: <28519085.post@talk.nabble.com> Anne Archibald-2 wrote: > > on a 32-bit machine, > the space overhead is roughly a 32-bit object pointer or two for each > float, plus about twice the number of floats times 32-bit pointers for > the table. > Hello Anne, I'm a bit confused by the above. It sounds as though the hash table approach might occupy 4 times as much storage as a single array. Is that right? Also, I don't understand why hashing would be useful for the set application. It seems as though a red-black tree might be a good implementation for a set of floats if all that one wants to do is add, delete, and test membership. (I will also need to do unions). Thanks for the help! Phillip -- View this message in context: http://old.nabble.com/efficient-way-to-manage-a-set-of-floats--tp28518014p28519085.html Sent from the Numpy-discussion mailing list archive at Nabble.com. From warren.weckesser at enthought.com Mon May 10 22:44:27 2010 From: warren.weckesser at enthought.com (Warren Weckesser) Date: Mon, 10 May 2010 21:44:27 -0500 Subject: [Numpy-discussion] efficient way to manage a set of floats? In-Reply-To: <28519085.post@talk.nabble.com> References: <28518014.post@talk.nabble.com> <28519085.post@talk.nabble.com> Message-ID: <4BE8C48B.4090209@enthought.com> Dr. Phillip M. Feldman wrote: > Anne Archibald-2 wrote: > >> on a 32-bit machine, >> the space overhead is roughly a 32-bit object pointer or two for each >> float, plus about twice the number of floats times 32-bit pointers for >> the table. >> >> > > Hello Anne, > > I'm a bit confused by the above. It sounds as though the hash table > approach might occupy 4 times as much storage as a single array. Is that > right? > > Also, I don't understand why hashing would be useful for the set > application. 
> > It seems as though a red-black tree might be a good implementation for a set > of floats if all that one wants to do is add, delete, and test membership. > (I will also need to do unions). > > A couple questions: How many floats will you be storing? When you test for membership, will you want to allow for a numerical tolerance, so that if the value 1 - 0.7 is added to the set, a test for the value 0.3 returns True? (0.3 is actually 0.29999999999999999, while 1-0.7 is 0.30000000000000004) Warren > Thanks for the help! > > Phillip > From aarchiba at physics.mcgill.ca Mon May 10 23:17:28 2010 From: aarchiba at physics.mcgill.ca (Anne Archibald) Date: Mon, 10 May 2010 23:17:28 -0400 Subject: [Numpy-discussion] efficient way to manage a set of floats? In-Reply-To: <28519085.post@talk.nabble.com> References: <28518014.post@talk.nabble.com> <28519085.post@talk.nabble.com> Message-ID: On 10 May 2010 21:56, Dr. Phillip M. Feldman wrote: > > > Anne Archibald-2 wrote: >> >> on a 32-bit machine, >> the space overhead is roughly a 32-bit object pointer or two for each >> float, plus about twice the number of floats times 32-bit pointers for >> the table. >> > > Hello Anne, > > I'm a bit confused by the above. ?It sounds as though the hash table > approach might occupy 4 times as much storage as a single array. ?Is that > right? Probably. Hash tables usually operate at about half-full for efficiency, so you'd have twice as many entries as you do objects. If you're using a python hash table, you also have type tagging and malloc information for each object. If you had a custom implementation, you'd have to have some way to mark hash cells as empty. In any case, expect at least a doubling of the space needed for a simple array. > Also, I don't understand why hashing would be useful for the set > application. The reason hash tables rely on hashing is not just to obtain a number for a potentially complex object; a good hash function should produce effectively random numbers for the user's objects. These numbers are reduced modulo the size of the hash table to determine where the object should go. If the user supplies a whole bunch of objects that all happen to hash to the same value, they'll all try to go into the same bin, and the hash table degenerates to an O(n) object as it has to search through the whole list each time. If you are using floating-point objects, well, for example the exponent may well be all the same, or take only a few values. If, after reduction modulo the table size, it's what gets used to determine where your numbers go, they'll all go in the same bin. Or you could be using all integers, which usually end with lots of zeros in floating-point representation, so that all your numbers go in exactly the same bin. You could try using non-power-of-two table sizes, but that just sort of hides the problem: someone someday will provide a collection of numbers, let's say a, a+b, a+2b, ... that reduce to the same value modulo your table size, and suddenly your hash table is agonizingly slow. There's kind of an art to designing good general-purpose hash functions; it's very handy that python provides one. > It seems as though a red-black tree might be a good implementation for a set > of floats if all that one wants to do is add, delete, and test membership. > (I will also need to do unions). If you're implementing it from scratch, you could go with a red-black tree, but a hash table is probably faster. I'd go with a simple hash table with linked-list buckets. 
Managing insertions and deletions should be only a minor pain compared to implementing a whole tree structure. You can probably find a nice hash function for floats with a bit of googling the literature. I should say, try just using python sets first, and only go into all this if they prove to be the slowest part of your program. > Thanks for the help! Good luck, Anne > Phillip > -- > View this message in context: http://old.nabble.com/efficient-way-to-manage-a-set-of-floats--tp28518014p28519085.html > Sent from the Numpy-discussion mailing list archive at Nabble.com. > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From josef.pktd at gmail.com Mon May 10 23:37:20 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Mon, 10 May 2010 23:37:20 -0400 Subject: [Numpy-discussion] pareto docstring In-Reply-To: References: Message-ID: On Mon, May 10, 2010 at 2:14 PM, T J wrote: > On Sun, May 9, 2010 at 4:49 AM, ? wrote: >> >> I think this is the same point, I was trying to make last year. >> >> Instead of renormalizing, my conclusion was the following, >> (copied from the mailinglist August last year) >> >> """ >> my conclusion: >> --------------------- >> What numpy.random.pareto actually produces, are random numbers from a >> pareto distribution with lower bound m=1, but location parameter >> loc=-1, that shifts the distribution to the left. >> >> To actually get useful ?random numbers (that are correct in the usual >> usage http://en.wikipedia.org/wiki/Pareto_distribution), we need to >> add 1 to them. >> stats.distributions doesn't use mtrand.pareto >> >> rvs_pareto = 1 + numpy.random.pareto(a, size) >> >> """ >> >> I still have to work though the math of your argument, but maybe we >> can come to an agreement how the docstrings (or the function) should >> be changed, and what numpy.random.pareto really means. >> >> Josef >> (grateful, that there are another set of eyes on this) >> >> > > > Yes, I think my "renormalizing" statement is incorrect as it is really > just sampling from a different pdf altogether. ?See the following image: > > http://www.dumpt.com/img/viewer.php?file=q9tfk7ehxsw865vn067c.png > > It plots histograms of the various implementations against the pdfs. > Summarizing: > > The NumPy implementation is based on (Devroye p. 262). ?The pdf listed > there is: > > ? ?a / (1+x)^(a+1) > > This differs from the "standard" Pareto pdf: > > ? ?a / x^(a+1) > > It also differs from the pdf of the generalized Pareto distribution, > with scale=1 and location=0: > > ? ?(1 + a x)^(-1/a - 1) > > And it also differs from the pdf of the generalized Pareto > distribution with scale=1 and location=-1 ?or location=1. > > random.paretovariate and scipy.stats.pareto sample from the standard > Pareto, and this is the desired behavior, IMO. ?Its true that "1 + > np.random.pareto" provides the fix, but I think we're better off > changing the underlying implementation. ?Devroye has a more recent > paper: > > ?http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.85.8760 > > which states the Pareto distribution in the standard way. ?So I think > it is safe to make this change. ?Backwards compatibility might be the > only argument for not making this change. ?So here is my proposal: > > ?1) Remove every mention of the generalized Pareto distribution from > the docstring. 
?As far as I can see, the generalized Pareto > distribution does not reduce to the "standard" Pareto at all. ?We can > still mention scipy.stats.distributions.genpareto and > scipy.stats.distributions.pareto. ?The former is something different > and the latter will (now) be equivalent to the NumPy function. > > ?2) Modify numpy/random/mtrand/distributions.c in the following way: > > double rk_pareto(rk_state *state, double a) > { > ? //return exp(rk_standard_exponential(state)/a) - 1; > ? return 1.0 / rk_double(state)**(1.0 / a); > } > > Does this sound good? I went googling and found a new interpretation numpy.random.pareto is actually the Lomax distribution also known as Pareto 2, Pareto (II) or Pareto Second Kind distribution http://www.itl.nist.gov/div898/software/dataplot/refman2/auxillar/pa2pdf.htm http://www.mathwave.com/articles/pareto_2_lomax_distribution.html http://hosho.ees.hokudai.ac.jp/~kubo/Rdoc/library/VGAM/html/lomax.html which is different from the Pareto (First Kind) distribution http://www.mathwave.com/help/easyfit/html/analyses/distributions/pareto.html http://en.wikipedia.org/wiki/Pareto_distribution#Density_function The R-VGAM docs have the reference to http://books.google.ca/books?id=pGsAZ3W7uEAC&pg=PA226&lpg=PA226&dq=Kleiber,+C.+and+Kotz,+S.+%282003%29+Statistical+Size+Distributions+in+Economics+and+Actuarial+Sciences+pareto+lomax&source=bl&ots=j1-AoRxm5E&sig=KDrWehJW5kt-EKH1VjDe-lFMRpw&hl=en&ei=csPoS_bgDYT58Aau46z2Cg&sa=X&oi=book_result&ct=result&resnum=1&ved=0CBUQ6AEwAA#v=onepage&q&f=false which states on page 227 X is distributed as Lomax(b,q) <=> X + b is distributed Pareto(b, q) where b is the scale parameter b=1 in numpy.random.pareto notation and q is the shape parameter alpha quote: "... the Pareto (II) distribution - after all, it's just a shifted classical Pareto distribution - ..." So, from this it looks like numpy.random does not have a Pareto distribution, only Lomax, and the confusion came maybe because somewhere in the history the (II) (second kind) got dropped in the explanations. and actually it is in scipy.stats.distributions, but without rvs # LOMAX (Pareto of the second kind.) # Special case of Pareto of the first kind (location=-1.0) class lomax_gen(rv_continuous): def _pdf(self, x, c): return c*1.0/(1.0+x)**(c+1.0) def _cdf(self, x, c): return 1.0-1.0/(1.0+x)**c def _ppf(self, q, c): return pow(1.0-q,-1.0/c)-1 def _stats(self, c): mu, mu2, g1, g2 = pareto.stats(c, loc=-1.0, moments='mvsk') return mu, mu2, g1, g2 def _entropy(self, c): return 1+1.0/c-log(c) lomax = lomax_gen(a=0.0, name="lomax", longname="A Lomax (Pareto of the second kind)", shapes="c", extradoc=""" Lomax (Pareto of the second kind) distribution lomax.pdf(x,c) = c / (1+x)**(c+1) for x >= 0, c > 0. """ There are too many distribution, and too many different names. >So here is my proposal: > > 1) Remove every mention of the generalized Pareto distribution from > the docstring. As far as I can see, the generalized Pareto > distribution does not reduce to the "standard" Pareto at all. We can > still mention scipy.stats.distributions.genpareto and > scipy.stats.distributions.pareto. The former is something different > and the latter will (now) be equivalent to the NumPy function. 
agreed, although I haven't figured out yet why Pareto and generalized Pareto have similar names > > 2) Modify numpy/random/mtrand/distributions.c in the following way: > > double rk_pareto(rk_state *state, double a) > { > //return exp(rk_standard_exponential(state)/a) - 1; > return 1.0 / rk_double(state)**(1.0 / a); > } I'm not an expert on random number generator, but using the uniform distribution as in http://en.wikipedia.org/wiki/Pareto_distribution#Generating_a_random_sample_from_Pareto_distribution and your Devroy reference seems better, than based on the relationship to the exponential distribution http://en.wikipedia.org/wiki/Pareto_distribution#Relation_to_the_exponential_distribution I think without changing the source we can rewrite the docstring that this is Lomax (or Pareto of the Second Kind), so that at least the documentation is less misleading. But I find calling it Pareto very confusing, and I'm not the only one anymore, (and I don't know if anyone has used it assuming it is classical Pareto), so my preferred solution would be * rename numpy.random.pareto to numpy.random.lomax * and create a real (classical, first kind) pareto distribution (even though it's just adding or subtracting 1, ones we know it) (and I'm adding the _rvs to scipy.stats.lomax) What's the backwards compatibility policy with very confusing names in numpy? Josef > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From tjhnson at gmail.com Tue May 11 03:23:52 2010 From: tjhnson at gmail.com (T J) Date: Tue, 11 May 2010 00:23:52 -0700 Subject: [Numpy-discussion] pareto docstring In-Reply-To: References: Message-ID: On Mon, May 10, 2010 at 8:37 PM, wrote: > > I went googling and found a new interpretation > > numpy.random.pareto is actually the Lomax distribution also known as Pareto 2, > Pareto (II) or Pareto Second Kind distribution > Great! > > So, from this it looks like numpy.random does not have a Pareto > distribution, only Lomax, and the confusion came maybe because > somewhere in the history the (II) (second kind) got dropped in the > explanations. > > and actually it is in scipy.stats.distributions, but without rvs > > # LOMAX (Pareto of the second kind.) > # ?Special case of Pareto of the first kind (location=-1.0) > I understand the point with this last comment, but I think it can be confusing in that the Pareto (of the first kind) has no "location" parameter and people might think you are referring to the Generalized Pareto distribution. I think its much clearer to say: # Special case of the Pareto of the first kind, but shifted to the left by 1. x --> x + 1 > >> >> ?2) Modify numpy/random/mtrand/distributions.c in the following way: >> >> double rk_pareto(rk_state *state, double a) >> { >> ? //return exp(rk_standard_exponential(state)/a) - 1; >> ? return 1.0 / rk_double(state)**(1.0 / a); >> } > > I'm not an expert on random number generator, but using the uniform distribution > as in > http://en.wikipedia.org/wiki/Pareto_distribution#Generating_a_random_sample_from_Pareto_distribution > and your Devroy reference seems better, than based on the relationship to > the exponential distribution > http://en.wikipedia.org/wiki/Pareto_distribution#Relation_to_the_exponential_distribution > > Correct. The exp relationship was for the existing implementation (which corresponds to the Lomax). I commented that line out and just used 1/U^(1/a). 
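At the Python level, the relationship being discussed looks roughly like this -- just a sketch of the inverse-transform step and the one-unit shift, not the actual patch (which would live in distributions.c and use pow() rather than **):

import numpy as np

a, n = 3.0, 100000
u = 1.0 - np.random.uniform(size=n)   # uniform on (0, 1], avoids division by zero

pareto1 = 1.0 / u**(1.0 / a)          # classical Pareto (first kind), support x >= 1
lomax = 1.0 / u**(1.0 / a) - 1.0      # Pareto II / Lomax, support x >= 0

# The current numpy.random.pareto draws match the second line in distribution,
# which is why adding 1 to its output recovers the classical Pareto.
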
> I think without changing the source we can rewrite the docstring that > this is Lomax (or > Pareto of the Second Kind), so that at least the documentation is less > misleading. > > But I find calling it Pareto very confusing, and I'm not the only one anymore, > (and I don't know if anyone has used it assuming it is classical Pareto), > so my preferred solution would be > > * rename numpy.random.pareto to numpy.random.lomax > * and create a real (classical, first kind) pareto distribution (even > though it's just > ?adding or subtracting 1, ones we know it) > I personally have used numpy.random.pareto thinking it was the Pareto distribution of the first kind---which led to this post in the first place. So, I'm in strong agreement. While doing this, perhaps we should increase functionality and allow users the ability to specify the scale of the distribution (instead of just the shape)? I can make a ticket for this and give a stab at creating the necessary patch. > > What's the backwards compatibility policy with very confusing names in numpy? > It seems reasonable that we might have to follow the deprecation route, but I'd be happier with a "faster" fix. 1.5 - Provide numpy.random.lomax. Make numpy.random.pareto raise a DeprecationWarning and then call lomax. 2.0 (if there is no 1.6) - Make numpy.random.pareto behave as Pareto distribution of 1st kind. Immediately though, we can modify the docstring that is currently in there to make the situation clear, instructing users how they can generate samples from the "standard" Pareto distribution. This is the first patch I'll submit. Perhaps it is better to only change the docstring and then save all changes in functionality for 2.0. Deferring to others on this one... From d.l.goldsmith at gmail.com Tue May 11 03:48:04 2010 From: d.l.goldsmith at gmail.com (David Goldsmith) Date: Tue, 11 May 2010 00:48:04 -0700 Subject: [Numpy-discussion] pareto docstring In-Reply-To: References: Message-ID: On Tue, May 11, 2010 at 12:23 AM, T J wrote: > On Mon, May 10, 2010 at 8:37 PM, wrote: > > > > I went googling and found a new interpretation > > > > numpy.random.pareto is actually the Lomax distribution also known as > Pareto 2, > > Pareto (II) or Pareto Second Kind distribution > > > > Great! > > > > > So, from this it looks like numpy.random does not have a Pareto > > distribution, only Lomax, and the confusion came maybe because > > somewhere in the history the (II) (second kind) got dropped in the > > explanations. > > > > and actually it is in scipy.stats.distributions, but without rvs > > > > # LOMAX (Pareto of the second kind.) > > # Special case of Pareto of the first kind (location=-1.0) > > > > I understand the point with this last comment, but I think it can be > confusing in that the Pareto (of the first kind) has no "location" > parameter and people might think you are referring to the Generalized > Pareto distribution. I think its much clearer to say: > > # Special case of the Pareto of the first kind, but shifted to the > left by 1. 
x --> x + 1 > > > > >> > >> 2) Modify numpy/random/mtrand/distributions.c in the following way: > >> > >> double rk_pareto(rk_state *state, double a) > >> { > >> //return exp(rk_standard_exponential(state)/a) - 1; > >> return 1.0 / rk_double(state)**(1.0 / a); > >> } > > > > I'm not an expert on random number generator, but using the uniform > distribution > > as in > > > http://en.wikipedia.org/wiki/Pareto_distribution#Generating_a_random_sample_from_Pareto_distribution > > and your Devroy reference seems better, than based on the relationship to > > the exponential distribution > > > http://en.wikipedia.org/wiki/Pareto_distribution#Relation_to_the_exponential_distribution > > > > > > Correct. The exp relationship was for the existing implementation > (which corresponds to the Lomax). I commented that line out and just > used 1/U^(1/a). > > > > I think without changing the source we can rewrite the docstring that > > this is Lomax (or > > Pareto of the Second Kind), so that at least the documentation is less > > misleading. > > > > But I find calling it Pareto very confusing, and I'm not the only one > anymore, > > (and I don't know if anyone has used it assuming it is classical Pareto), > > so my preferred solution would be > > > > * rename numpy.random.pareto to numpy.random.lomax > > * and create a real (classical, first kind) pareto distribution (even > > though it's just > > adding or subtracting 1, ones we know it) > > > > I personally have used numpy.random.pareto thinking it was the Pareto > distribution of the first kind---which led to this post in the first > place. So, I'm in strong agreement. While doing this, perhaps we > should increase functionality and allow users the ability to specify > the scale of the distribution (instead of just the shape)? > > I can make a ticket for this and give a stab at creating the necessary > patch. > > > > > > What's the backwards compatibility policy with very confusing names in > numpy? > > > > It seems reasonable that we might have to follow the deprecation > route, but I'd be happier with a "faster" fix. > > 1.5 > - Provide numpy.random.lomax. Make numpy.random.pareto raise a > DeprecationWarning and then call lomax. > 2.0 (if there is no 1.6) > - Make numpy.random.pareto behave as Pareto distribution of 1st kind. > > Immediately though, we can modify the docstring that is currently in > there to make the situation clear, instructing users how they can > generate samples from the "standard" Pareto distribution. This is the > first patch I'll submit. Perhaps it is better to only change the > docstring and then save all changes in functionality for 2.0. > Deferring to others on this one... > Elsewhere in the mailing list, it has been stated that our "policy" is to document desired/intended behavior, when such differs from actual (current) behavior. This can be done in advance of a code fix to implement the desired behavior, but we have discouraged (to the point of saying "don't do it") documenting current behavior when it is known that this should (and presumably will) be changed. DG > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -- Mathematician: noun, someone who disavows certainty when their uncertainty set is non-empty, even if that set has measure zero. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From pav at iki.fi Tue May 11 04:14:01 2010 From: pav at iki.fi (Pauli Virtanen) Date: Tue, 11 May 2010 08:14:01 +0000 (UTC) Subject: [Numpy-discussion] pareto docstring References: Message-ID: Tue, 11 May 2010 00:23:52 -0700, T J wrote: [clip] > It seems reasonable that we might have to follow the deprecation route, > but I'd be happier with a "faster" fix. > > 1.5 > - Provide numpy.random.lomax. Make numpy.random.pareto raise a > DeprecationWarning and then call lomax. > > 2.0 (if there is no 1.6) > - Make numpy.random.pareto behave as Pareto distribution of 1st kind. I think the next Numpy release will be 2.0. How things were done with changes in the histogram function were: 1) Add a "new=False" keyword argument, and raise a DeprecationWarning if new==False. The user then must call it with "pareto(..., new=True)" to get the correct behaviour. 2) In the next release, change the default to "new=True". Another option would be to add a correct implementation with a different name, e.g. `pareto1` to signal it's the first kind, and deprecate the old function altogether. A third option would be just to silently fix the bug. In any case the change should be mentioned noticeably in the release notes. -- Pauli Virtanen From bioinformed at gmail.com Tue May 11 08:11:51 2010 From: bioinformed at gmail.com (Kevin Jacobs ) Date: Tue, 11 May 2010 08:11:51 -0400 Subject: [Numpy-discussion] pareto docstring In-Reply-To: References: Message-ID: On Tue, May 11, 2010 at 4:14 AM, Pauli Virtanen wrote: > A third option would be just to silently fix the bug. In any case the > change should be mentioned noticeably in the release notes. > > I see this as two bugs: the Lomax distribution was named incorrectly and the Parato distribution was incorrect or confusingly labeled. Both should be fixed and clearly documented. Unlike cases of changing tastes and preferences, it seems unduly complicated and confusing to perseverate with backward compatibility shims. The next release is NumPy 2.0, which will have other known and well advertised API and ABI incompatibilities. Just my 2e-10 cents, -Kevin -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Tue May 11 10:42:23 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Tue, 11 May 2010 10:42:23 -0400 Subject: [Numpy-discussion] pareto docstring In-Reply-To: References: Message-ID: On Tue, May 11, 2010 at 8:11 AM, Kevin Jacobs wrote: > On Tue, May 11, 2010 at 4:14 AM, Pauli Virtanen wrote: >> >> A third option would be just to silently fix the bug. In any case the >> change should be mentioned noticeably in the release notes. >> > > I see this as two bugs: the Lomax distribution was named incorrectly and the > Parato distribution was incorrect or confusingly labeled. ?Both should be > fixed and clearly documented. ?Unlike cases of changing tastes and > preferences, it seems unduly complicated and confusing to?perseverate?with > backward?compatibility?shims. ?The next release is NumPy 2.0, which will > have other known and well?advertised?API and ABI incompatibilities. > Just my 2e-10 cents, > -Kevin I would have also considered it as a bug fix, except that there might be users who use the correction (+1) as a workaround. In that case, just changing the behavior without raising an exception for the current usage will introduce hard to find bugs. (It's difficult to see whether the random numbers are correct or as expected without proper testing.) 
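To make "proper testing" concrete, here is one rough check -- a sketch only, assuming scipy is available; it leans on stats.kstest and the 'pareto' and 'lomax' distributions that are already in scipy.stats:

import numpy as np
from scipy import stats

a = 3.0
draws = np.random.pareto(a, 100000)

print stats.kstest(draws, 'pareto', args=(a,))       # poor fit, tiny p-value
print stats.kstest(draws, 'lomax', args=(a,))        # good fit
print stats.kstest(draws + 1, 'pareto', args=(a,))   # the "+ 1" workaround fits too
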
For example, we use the work-around in the docstring of http://docs.scipy.org/numpy/docs/numpy.random.mtrand.RandomState.power/ and actually, reading the numpy.random.pareto docstring again more carefully, the example does the correction also:: Draw samples from the distribution: >>> a, m = 3., 1. # shape and mode >>> s = np.random.pareto(a, 1000) + m But it's very confusing, also there is a relationship between Pareto/Lomax and GPD, but I'm not sure yet my algebra is correct. (and I have misplaced the graphs and tables with the relationships between different distributions) To minimize backwards compatibility problems we could attach a *big* warning text to pareto ("use at your own risk") and create new random variates, as Pauli proposed pareto1 - classical pareto pareto2 or lomax - with random variates the same as current pareto both could then get clear, unambiguous descriptions. and a note to using the uniform distribution for the generation of random numbers. python 2.5 random.py uses the half open uniform distribution to avoid division by zero, I don't know how numpy.random handles boundary values def paretovariate(self, alpha): """Pareto distribution. alpha is the shape parameter.""" # Jain, pg. 495 u = 1.0 - self.random() return 1.0 / pow(u, 1.0/alpha) Josef > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > From josef.pktd at gmail.com Tue May 11 11:50:39 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Tue, 11 May 2010 11:50:39 -0400 Subject: [Numpy-discussion] pareto docstring In-Reply-To: References: Message-ID: Assuming no typos Relationship between Pareto, Pareto(II)/Lomax and Generalized Pareto,GPD >>> import sympy as sy >>> x = sy.Symbol('x') >>> k = sy.Symbol('k') >>> c = sy.Symbol('c') >>> a = sy.Symbol('a') >>> m = sy.Symbol('m') >>> mgpd = sy.Symbol('mgpd') >>> gpd0 = (1 - c*x/k)**(1/c - 1)/k #JKB notation (c reversed sign) >>> gpd = 1/k/(1 + c*(x-mgpd)/k)**(1/c + 1) #similar to Wikipedia >>> par = a*k**a/(x-m)**(1+a) #JKB >>> lom = a/k/(1+x/k)**(1+a) >>> lom.subs(k,1) #Pareto(II), Lomax (loc=0, scale=1) a*(1 + x)**(-1 - a) >>> par.subs(k,1).subs(m,-1) #Pareto with loc=-1, scale=1 a*(1 + x)**(-1 - a) >>> gpd.subs(c,1/a).subs(k,1/a).subs(mgpd,0) #GPD with loc=0, scale=1/a a*(1 + x)**(-1 - a) >>> par.subs(k,1).subs(m,0) # standard Pareto (loc=0, scale=1) a*x**(-1 - a) >>> gpd.subs(c,1/a).subs(k,1/a).subs(mgpd,1) #GPD with loc=1, scale=1/a a*x**(-1 - a) Josef From jkington at wisc.edu Tue May 11 12:15:56 2010 From: jkington at wisc.edu (Joe Kington) Date: Tue, 11 May 2010 11:15:56 -0500 Subject: [Numpy-discussion] Downcasting an array in-place? Message-ID: Is it possible to downcast an array in-place? For example: x = np.random.random(10) # Placeholder for "real" data x -= x.min() x /= x.ptp() / 255 x = x.astype(np.uint8) <-- returns a copy First off, a bit of background to the question... At the moment, I'm trying to downcast a large (>10GB) array of uint16's to uint8's. I have enough RAM to fit everything into memory, but I'd really prefer to use as few copies as possible.... In the particular case of a C-ordered uint16 array to uint8 on a little-endian system, I can do this: # "x" is the big 3D array of uint16's x -= x.min() x /= x.ptp() / 255 x = x.view(np.uint8)[:, :, ::2] That works, but a) produces a non-contiguous array, and b) is awfully case-specific. 
Is there a way to do something similar to astype(), but have it "cannibalize" the memory of the original array? (e.g. the "out" argument in a ufunc?) Hopefully my question makes some sense to someone other than myself... Thanks! -Joe -------------- next part -------------- An HTML attachment was scrubbed... URL: From matt.gregory at oregonstate.edu Wed May 12 18:37:17 2010 From: matt.gregory at oregonstate.edu (Gregory, Matthew) Date: Wed, 12 May 2010 15:37:17 -0700 Subject: [Numpy-discussion] newbie: convert recarray to floating-point ndarray with mixed types Message-ID: <1D673F86DDA00841A1216F04D1CE70D64183757513@EXCH2.nws.oregonstate.edu> Apologies for what is likely a simple question and I hope it hasn't been asked before ... Given a recarray with a dtype consisting of more than one type, e.g. >>> import numpy as n >>> a = n.array([(1.0, 2), (3.0, 4)], dtype=[('x', float), ('y', int)]) >>> b = a.view(n.recarray) >>> b rec.array([(1.0, 2), (3.0, 4)], dtype=[('x', '>> c = b.view(dtype='float').reshape(b.size,-1) but that fails with: ValueError: new type not compatible with array. I understand why this would fail (as it is a view and not a copy), but I'm lost on a method to do this conversion simply. thanks, matt From pfeldman at verizon.net Wed May 12 20:09:27 2010 From: pfeldman at verizon.net (Dr. Phillip M. Feldman) Date: Wed, 12 May 2010 17:09:27 -0700 (PDT) Subject: [Numpy-discussion] efficient way to manage a set of floats? In-Reply-To: <4BE8C48B.4090209@enthought.com> References: <28518014.post@talk.nabble.com> <28519085.post@talk.nabble.com> <4BE8C48B.4090209@enthought.com> Message-ID: <28542439.post@talk.nabble.com> Warren Weckesser-3 wrote: > > A couple questions: > > How many floats will you be storing? > > When you test for membership, will you want to allow for a numerical > tolerance, so that if the value 1 - 0.7 is added to the set, a test for > the value 0.3 returns True? (0.3 is actually 0.29999999999999999, while > 1-0.7 is 0.30000000000000004) > > Warren > Anne- Thanks for that absolutely beautiful explanation!! Warren- I had not initially thought about numerical tolerance, but this could potentially be an issue, in which case the management of the data would have to be completely different. Thanks for pointing this out! I might have as many as 50,000 values. Phillip -- View this message in context: http://old.nabble.com/efficient-way-to-manage-a-set-of-floats--tp28518014p28542439.html Sent from the Numpy-discussion mailing list archive at Nabble.com. From pfeldman at verizon.net Wed May 12 20:19:12 2010 From: pfeldman at verizon.net (Dr. Phillip M. Feldman) Date: Wed, 12 May 2010 17:19:12 -0700 (PDT) Subject: [Numpy-discussion] default behavior of argsort Message-ID: <28542476.post@talk.nabble.com> When operating on an array whose last dimension is unity, the default behavior of argsort is not very useful: |6> x=random.random((4,1)) |7> shape(x) <7> (4, 1) |8> argsort(x) <8> array([[0], [0], [0], [0]]) |9> argsort(x,axis=0) <9> array([[0], [2], [1], [3]]) -- View this message in context: http://old.nabble.com/default-behavior-of-argsort-tp28542476p28542476.html Sent from the Numpy-discussion mailing list archive at Nabble.com. From aarchiba at physics.mcgill.ca Wed May 12 21:22:37 2010 From: aarchiba at physics.mcgill.ca (Anne Archibald) Date: Wed, 12 May 2010 21:22:37 -0400 Subject: [Numpy-discussion] efficient way to manage a set of floats? 
In-Reply-To: <28542439.post@talk.nabble.com> References: <28518014.post@talk.nabble.com> <28519085.post@talk.nabble.com> <4BE8C48B.4090209@enthought.com> <28542439.post@talk.nabble.com> Message-ID: On 12 May 2010 20:09, Dr. Phillip M. Feldman wrote: > > > Warren Weckesser-3 wrote: >> >> A couple questions: >> >> How many floats will you be storing? >> >> When you test for membership, will you want to allow for a numerical >> tolerance, so that if the value 1 - 0.7 is added to the set, a test for >> the value 0.3 returns True? ?(0.3 is actually 0.29999999999999999, while >> 1-0.7 is 0.30000000000000004) >> >> Warren >> > > Anne- Thanks for that absolutely beautiful explanation!! > > Warren- I had not initially thought about numerical tolerance, but this > could potentially be an issue, in which case the management of the data > would have to be completely different. ?Thanks for pointing this out! ?I > might have as many as 50,000 values. If you want one-dimensional "sets" with numerical tolerances, then either a sorted-array implementation looks more appealing. A sorted-tree implementation will be a little awkward, since you will often need to explore two branches to find out the nearest neighbour of a query point. In fact what you have is a one-dimensional kd-tree, which is helpfully provided by scipy.spatial, albeit without insertion or deletion operators. I should also point out that when you start wanting approximate matches, which you will as soon as you do any sort of arithmetic on your floats, your idea of a "set" becomes extremely messy. For example, suppose you try to insert a float that's one part in a million different from one that's in the table. Does it get inserted too or is it "equal" to what's there? When it comes time to remove it, your query will probably have a value slightly different from either previous value - which one, or both, do you remove? Or do you raise an exception? Resolving these questions satisfactorily will probably require you to know the scales that are relevant in your problem and implement sensible handling of scales larger or smaller than this (but beware of the "teapot in a stadium problem", of wildly different scales in the same data set). Even so, you will want to write algorithms that are robust to imprecision, duplication, and disappearance of points in your sets. (If this sounds like the voice of bitter experience, well, I discovered while writing a commercial ray-tracer that when you shoot billions of rays into millions of triangles, all sorts of astonishing limitations of floating-point turn into graphical artifacts. Which are always *highly* visible. It was during this period that the interval-arithmetic camp nearly gained a convert.) Anne > Phillip > -- > View this message in context: http://old.nabble.com/efficient-way-to-manage-a-set-of-floats--tp28518014p28542439.html > Sent from the Numpy-discussion mailing list archive at Nabble.com. > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From robert.kern at gmail.com Wed May 12 21:27:16 2010 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 12 May 2010 21:27:16 -0400 Subject: [Numpy-discussion] efficient way to manage a set of floats? 
In-Reply-To: <28542439.post@talk.nabble.com> References: <28518014.post@talk.nabble.com> <28519085.post@talk.nabble.com> <4BE8C48B.4090209@enthought.com> <28542439.post@talk.nabble.com> Message-ID: On Wed, May 12, 2010 at 20:09, Dr. Phillip M. Feldman wrote: > > Warren Weckesser-3 wrote: >> >> A couple questions: >> >> How many floats will you be storing? >> >> When you test for membership, will you want to allow for a numerical >> tolerance, so that if the value 1 - 0.7 is added to the set, a test for >> the value 0.3 returns True? ?(0.3 is actually 0.29999999999999999, while >> 1-0.7 is 0.30000000000000004) >> >> Warren >> > > Anne- Thanks for that absolutely beautiful explanation!! > > Warren- I had not initially thought about numerical tolerance, but this > could potentially be an issue, in which case the management of the data > would have to be completely different. ?Thanks for pointing this out! ?I > might have as many as 50,000 values. You may want to explain your higher-level problem. Maintaining sets of floating point numbers is almost never the right approach. With sets, comparison must necessarily be by exact equality because fuzzy equality is not transitive. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From josef.pktd at gmail.com Wed May 12 21:37:41 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Wed, 12 May 2010 21:37:41 -0400 Subject: [Numpy-discussion] efficient way to manage a set of floats? In-Reply-To: References: <28518014.post@talk.nabble.com> <28519085.post@talk.nabble.com> <4BE8C48B.4090209@enthought.com> <28542439.post@talk.nabble.com> Message-ID: On Wed, May 12, 2010 at 9:27 PM, Robert Kern wrote: > On Wed, May 12, 2010 at 20:09, Dr. Phillip M. Feldman > wrote: >> >> Warren Weckesser-3 wrote: >>> >>> A couple questions: >>> >>> How many floats will you be storing? >>> >>> When you test for membership, will you want to allow for a numerical >>> tolerance, so that if the value 1 - 0.7 is added to the set, a test for >>> the value 0.3 returns True? ?(0.3 is actually 0.29999999999999999, while >>> 1-0.7 is 0.30000000000000004) >>> >>> Warren >>> >> >> Anne- Thanks for that absolutely beautiful explanation!! >> >> Warren- I had not initially thought about numerical tolerance, but this >> could potentially be an issue, in which case the management of the data >> would have to be completely different. ?Thanks for pointing this out! ?I >> might have as many as 50,000 values. > > You may want to explain your higher-level problem. Maintaining sets of > floating point numbers is almost never the right approach. With sets, > comparison must necessarily be by exact equality because fuzzy > equality is not transitive. with consistent scaling, shouldn't something like rounding to a fixed precision be enough? >>> round(1 - 0.7,14) == round(0.3, 14) True >>> 1 - 0.7 == 0.3 False or approx_equal instead of almost_equal Josef > -- > Robert Kern > > "I have come to believe that the whole world is an enigma, a harmless > enigma that is made terrible by our own mad attempt to interpret it as > though it had an underlying truth." 
> ?-- Umberto Eco > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From ben.root at ou.edu Wed May 12 22:06:50 2010 From: ben.root at ou.edu (Benjamin Root) Date: Wed, 12 May 2010 21:06:50 -0500 Subject: [Numpy-discussion] efficient way to manage a set of floats? In-Reply-To: References: <28518014.post@talk.nabble.com> <28519085.post@talk.nabble.com> <4BE8C48B.4090209@enthought.com> <28542439.post@talk.nabble.com> Message-ID: On Wed, May 12, 2010 at 8:37 PM, wrote: > On Wed, May 12, 2010 at 9:27 PM, Robert Kern > wrote: > > On Wed, May 12, 2010 at 20:09, Dr. Phillip M. Feldman > > wrote: > >> > >> Warren Weckesser-3 wrote: > >>> > >>> A couple questions: > >>> > >>> How many floats will you be storing? > >>> > >>> When you test for membership, will you want to allow for a numerical > >>> tolerance, so that if the value 1 - 0.7 is added to the set, a test for > >>> the value 0.3 returns True? (0.3 is actually 0.29999999999999999, > while > >>> 1-0.7 is 0.30000000000000004) > >>> > >>> Warren > >>> > >> > >> Anne- Thanks for that absolutely beautiful explanation!! > >> > >> Warren- I had not initially thought about numerical tolerance, but this > >> could potentially be an issue, in which case the management of the data > >> would have to be completely different. Thanks for pointing this out! I > >> might have as many as 50,000 values. > > > > You may want to explain your higher-level problem. Maintaining sets of > > floating point numbers is almost never the right approach. With sets, > > comparison must necessarily be by exact equality because fuzzy > > equality is not transitive. > > with consistent scaling, shouldn't something like rounding to a fixed > precision be enough? > > >>> round(1 - 0.7,14) == round(0.3, 14) > True > >>> 1 - 0.7 == 0.3 > False > > or approx_equal instead of almost_equal > > Josef > > I have to agree with Robert. Whenever a fellow student comes to me describing an issue where they needed to find a floating point number in an array, the problem can usually be restated in a way that makes much more sense. There are so many issues with doing a naive comparison using round() (largely because it is intransitive as someone else already stated). As a quick and dirty solution to very specific issues, they work -- but they are almost never left as a final solution. Ben -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Wed May 12 22:13:59 2010 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 12 May 2010 22:13:59 -0400 Subject: [Numpy-discussion] efficient way to manage a set of floats? In-Reply-To: References: <28518014.post@talk.nabble.com> <28519085.post@talk.nabble.com> <4BE8C48B.4090209@enthought.com> <28542439.post@talk.nabble.com> Message-ID: On Wed, May 12, 2010 at 21:37, wrote: > On Wed, May 12, 2010 at 9:27 PM, Robert Kern wrote: >> On Wed, May 12, 2010 at 20:09, Dr. Phillip M. Feldman >> wrote: >>> >>> Warren Weckesser-3 wrote: >>>> >>>> A couple questions: >>>> >>>> How many floats will you be storing? >>>> >>>> When you test for membership, will you want to allow for a numerical >>>> tolerance, so that if the value 1 - 0.7 is added to the set, a test for >>>> the value 0.3 returns True? ?(0.3 is actually 0.29999999999999999, while >>>> 1-0.7 is 0.30000000000000004) >>>> >>>> Warren >>>> >>> >>> Anne- Thanks for that absolutely beautiful explanation!! 
>>> >>> Warren- I had not initially thought about numerical tolerance, but this >>> could potentially be an issue, in which case the management of the data >>> would have to be completely different. ?Thanks for pointing this out! ?I >>> might have as many as 50,000 values. >> >> You may want to explain your higher-level problem. Maintaining sets of >> floating point numbers is almost never the right approach. With sets, >> comparison must necessarily be by exact equality because fuzzy >> equality is not transitive. > > with consistent scaling, shouldn't something like rounding to a fixed > precision be enough? Then you might was well convert to integers and do integer sets. The problem is that two floats very close to a border (and hence each other) would end up in rounding to different bins. They will compare unequal to each other and equal to numbers farther away but in the same arbitrary bin. Again, it depends on the higher-level problem. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From sccolbert at gmail.com Wed May 12 23:06:00 2010 From: sccolbert at gmail.com (Chris Colbert) Date: Wed, 12 May 2010 23:06:00 -0400 Subject: [Numpy-discussion] Bug with how numpy.distutils.system_info handles the site.cfg Message-ID: I had this problem back in 2009 when building Enthought Enable, and was happy with a work around. It just bit me again, and I finally got around to drilling down to the problem. On linux, if one uses the numpy/site.cfg [default] section when building from source to specifiy local library directories, the x11 libs won't be found by NumPy. The relevant section of the site.cfg.example reads as follows: # Defaults # ======== # The settings given here will apply to all other sections if not overridden. # This is a good place to add general library and include directories like # /usr/local/{lib,include} # #[DEFAULT] #library_dirs = /usr/local/lib #include_dirs = /usr/local/include Now, I build NumPy with Atlas and my Atlas libs are installed in /usr/local, so my [default] section of site.cfg looks like this (as suggested by the site.cfg.example): # Defaults # ======== # The settings given here will apply to all other sections if not overridden. # This is a good place to add general library and include directories like # /usr/local/{lib,include} # [DEFAULT] library_dirs = /usr/local/lib:/usr/local/lib/atlas include_dirs = /usr/local/include NumPy builds and works fine with this. The problem occurs when other libraries use numpy.distutils.system_info.get_info('x11') (ala Enthought Enable). That function eventually calls numpy.distutils.system_info.system_info.parse_config_files which has the following definition: def parse_config_files(self): self.cp.read(self.files) if not self.cp.has_section(self.section): if self.section is not None: self.cp.add_section(self.section) When self.cp is instantiated (when looking for the x11 libs), it is provided the following defaults: {'libraries': '', 'src_dirs': '.:/usr/local/src', 'search_static_first': '0', 'library_dirs': '/usr/X11R6/lib64:/usr/X11R6/lib:/usr/X11/lib64:/usr/X11/lib:/usr/lib64:/usr/lib', 'include_dirs': '/usr/X11R6/include:/usr/X11/include:/usr/include'} As is clearly seen, the 'library_dirs' contains the proper paths to find the x11 libs. 
But since the config file has [default] section, these paths get trampled and replaced with whatever is contained in the site.cfg [default] section. In my case, this is /usr/local/lib:/usr/local/lib/atlas. Thus, my x11 libs aren't found and the Enable build fails. The workaround is to include an [x11] section in site.cfg with the appropriate paths, but I don't really feel this should be necessary. Would the better behavior be to look for a [default] section in the config file in the parse_config_files method and add those paths to the already specified defaults? Changing the site.cfg [default] section to read as follows: [DEFAULT] library_dirs = /usr/lib:/usr/local/lib:/usr/local/lib/atlas include_dirs = /usr/include:/usr/local/include is not an option because then NumPy will find and use the system atlas, which in my case is not threaded nor optimized for my machine. If you want me to patch the parse_config_files method, just let me know. Cheers, Chris -------------- next part -------------- An HTML attachment was scrubbed... URL: From sccolbert at gmail.com Wed May 12 23:15:18 2010 From: sccolbert at gmail.com (Chris Colbert) Date: Wed, 12 May 2010 23:15:18 -0400 Subject: [Numpy-discussion] Bug with how numpy.distutils.system_info handles the site.cfg In-Reply-To: References: Message-ID: On Wed, May 12, 2010 at 11:06 PM, Chris Colbert wrote: > I had this problem back in 2009 when building Enthought Enable, and was > happy with a work around. It just bit me again, and I finally got around to > drilling down to the problem. > > On linux, if one uses the numpy/site.cfg [default] section when building > from source to specifiy local library directories, the x11 libs won't be > found by NumPy. > > The relevant section of the site.cfg.example reads as follows: > > # Defaults > # ======== > # The settings given here will apply to all other sections if not > overridden. > # This is a good place to add general library and include directories like > # /usr/local/{lib,include} > # > #[DEFAULT] > #library_dirs = /usr/local/lib > #include_dirs = /usr/local/include > > Now, I build NumPy with Atlas and my Atlas libs are installed in > /usr/local, so my [default] section of site.cfg looks like this (as > suggested by the site.cfg.example): > > # Defaults > # ======== > # The settings given here will apply to all other sections if not > overridden. > # This is a good place to add general library and include directories like > # /usr/local/{lib,include} > # > [DEFAULT] > library_dirs = /usr/local/lib:/usr/local/lib/atlas > include_dirs = /usr/local/include > > > NumPy builds and works fine with this. The problem occurs when other > libraries use numpy.distutils.system_info.get_info('x11') (ala Enthought > Enable). That function eventually calls > numpy.distutils.system_info.system_info.parse_config_files which has the > following definition: > > def parse_config_files(self): > self.cp.read(self.files) > if not self.cp.has_section(self.section): > if self.section is not None: > self.cp.add_section(self.section) > > When self.cp is instantiated (when looking for the x11 libs), it is > provided the following defaults: > > {'libraries': '', 'src_dirs': '.:/usr/local/src', 'search_static_first': > '0', 'library_dirs': > '/usr/X11R6/lib64:/usr/X11R6/lib:/usr/X11/lib64:/usr/X11/lib:/usr/lib64:/usr/lib', > 'include_dirs': '/usr/X11R6/include:/usr/X11/include:/usr/include'} > > As is clearly seen, the 'library_dirs' contains the proper paths to find > the x11 libs. 
But since the config file has [default] section, these paths > get trampled and replaced with whatever is contained in the site.cfg > [default] section. In my case, this is /usr/local/lib:/usr/local/lib/atlas. > Thus, my x11 libs aren't found and the Enable build fails. > > The workaround is to include an [x11] section in site.cfg with the > appropriate paths, but I don't really feel this should be necessary. Would > the better behavior be to look for a [default] section in the config file in > the parse_config_files method and add those paths to the already specified > defaults? > > Then again, another workaround could be to add the atlas directory paths to the [blas_opt] and [lapack_opt] sections. This would work for my case, but it doesn't solve the larger problem of any directories put in [default] trouncing any of the other standard dirs that would otherwise be used. > Changing the site.cfg [default] section to read as follows: > > [DEFAULT] > library_dirs = /usr/lib:/usr/local/lib:/usr/local/lib/atlas > include_dirs = /usr/include:/usr/local/include > > is not an option because then NumPy will find and use the system atlas, > which in my case is not threaded nor optimized for my machine. > > If you want me to patch the parse_config_files method, just let me know. > > Cheers, > > Chris > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From efiring at hawaii.edu Thu May 13 01:40:00 2010 From: efiring at hawaii.edu (Eric Firing) Date: Wed, 12 May 2010 19:40:00 -1000 Subject: [Numpy-discussion] newbie: convert recarray to floating-point ndarray with mixed types In-Reply-To: <1D673F86DDA00841A1216F04D1CE70D64183757513@EXCH2.nws.oregonstate.edu> References: <1D673F86DDA00841A1216F04D1CE70D64183757513@EXCH2.nws.oregonstate.edu> Message-ID: <4BEB90B0.1080909@hawaii.edu> On 05/12/2010 12:37 PM, Gregory, Matthew wrote: > Apologies for what is likely a simple question and I hope it hasn't been asked before ... > > Given a recarray with a dtype consisting of more than one type, e.g. > > >>> import numpy as n > >>> a = n.array([(1.0, 2), (3.0, 4)], dtype=[('x', float), ('y', int)]) > >>> b = a.view(n.recarray) > >>> b > rec.array([(1.0, 2), (3.0, 4)], > dtype=[('x', ' > Is there a simple way to convert 'b' to a floating-point ndarray, casting the integer field to a floating-point? I've tried the na?ve: > > >>> c = b.view(dtype='float').reshape(b.size,-1) > > but that fails with: > > ValueError: new type not compatible with array. > > I understand why this would fail (as it is a view and not a copy), but I'm lost on a method to do this conversion simply. 
> It may not be as simple as you would like, but the following works efficiently: import numpy as np a = np.array([(1.0, 2), (3.0, 4)], dtype=[('x', float), ('y', int)]) b = np.empty((a.shape[0], 2), dtype=np.float) b[:,0] = a['x'] b[:,1] = a['y'] Eric > thanks, matt From nadavh at visionsense.com Thu May 13 08:18:37 2010 From: nadavh at visionsense.com (Nadav Horesh) Date: Thu, 13 May 2010 15:18:37 +0300 Subject: [Numpy-discussion] savetxt not working with python3.1 Message-ID: <710F2847B0018641891D9A21602763605AD3FD@ex3.envision.co.il> in module npyio.py lines 794,796 "file" should be replaced by "_file" Nadav From gael.varoquaux at normalesup.org Thu May 13 10:10:15 2010 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Thu, 13 May 2010 16:10:15 +0200 Subject: [Numpy-discussion] EuroScipy is finally open for registration Message-ID: <4BEC0847.2020501@normalesup.org> Registration for EuroScipy is finally open To register, go to the website , create an account, and you will see a /?register to the conference?/ button on the left. Follow it to a page which presents a /?shoping cart?/. Simply submitting this information registers you to the conference, and on the left of the website, the button will now display /?You are registered for the conference?/. The registration fee is 50 euros for the conference, and 50 euros for the tutorial. Right now there is no payment system: you will be contacted later (in a week) with instructions for paying. We apologize for such a late set up. We do realize this has come as an inconvenience to people. *Do not wait to register: the number of people we can host is limited.* An exciting program Tutorials: from beginners to experts We have two tutorial tracks: * *Introductory tutorial* : to get you to speed on scientific programming with Python. * *Advanced tutorial* : experts sharing their knowledge on specific techniques and libraries. We are very fortunate to have a top notch set of presenters. Scientific track: doing new science in Python Although the abstract submission is not yet over, We can say that we are going to have a rich set of talks, looking at the current submissions. In addition to the contributed talks, we have: * *Keynote speakers* : Hans Petter Langtangen and Konrard Hinsen, two major player of scientific computing in Python. * *Lightning talks* : one hour will be open for people to come up and present in a flash an interesting project. Publishing papers We are talking with the editors of a major scientific computing journal, and the odds are quite high that we will be able to publish a special issue on scientific computing in Python based on the proceedings of the conference. The papers will undergo peer-review independently from the conference, to ensure high quality of the final publication. Call for papers Abstract submission is still open, though not for long. We are soliciting contributions on scientific libraries and tools developed with Python and on scientific or engineering achievements using Python. These include applications, teaching, future development directions, and current research. See the call for papers . *We are very much looking forward to passionate discussions about Python in science in Paris* *Nicolas Chauvat and Ga?l Varoquaux* -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From nicola.vianello at gmail.com Thu May 13 17:55:28 2010 From: nicola.vianello at gmail.com (Nicola) Date: Thu, 13 May 2010 23:55:28 +0200 Subject: [Numpy-discussion] mac os x installation Message-ID: Hi. I've to admit that I'm quite new in python and also to numpy. I'm trying to install numpy and scypy on my mac (Mac Os X 10.6.3). I've installed the last version of python (Python 2.6.5 Mac OS X Installer Disk Image ). After than I've downloaded the latest version of numpy ( numpy-1.4.1-py2.6-python.org.dmg) and finally also the latest version of scypy ( scipy-0.7.2-py2.6-python.org.dmg ). When I try the test I obtain the following errors Python 2.6.5 (r265:79359, Mar 24 2010, 01:32:55) [GCC 4.0.1 (Apple Inc. build 5493)] on darwin Type "help", "copyright", "credits" or "license" for more information. >>> import numpy >>> numpy.test('1','10') Running unit tests for numpy Traceback (most recent call last): File "", line 1, in File "/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/site-packages/numpy/testing/nosetester.py", line 326, in test self._show_system_info() File "/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/site-packages/numpy/testing/nosetester.py", line 187, in _show_system_info nose = import_nose() File "/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/site-packages/numpy/testing/nosetester.py", line 69, in import_nose raise ImportError(msg) ImportError: Need nose >= 0.10.0 for tests - see http://somethingaboutorange.com/mrl/projects/nose >>> any idea? -- ---------------------------------------------- nicola.vianello at gmail.com skype:nicolavianello ---------------------------------------------- -------------- next part -------------- An HTML attachment was scrubbed... URL: From patrickmarshwx at gmail.com Thu May 13 17:57:49 2010 From: patrickmarshwx at gmail.com (Patrick Marsh) Date: Thu, 13 May 2010 16:57:49 -0500 Subject: [Numpy-discussion] mac os x installation In-Reply-To: References: Message-ID: You need to install the "nose" module to run the test suite. http://code.google.com/p/python-nose/ Patrick On Thu, May 13, 2010 at 4:55 PM, Nicola wrote: > Hi. I've to admit that I'm quite new in python and also to numpy. I'm > trying to install numpy and scypy on my mac (Mac Os X 10.6.3). I've > installed the last version of python (Python 2.6.5 Mac OS X Installer Disk > Image > ). > After than I've downloaded the latest version of numpy ( > numpy-1.4.1-py2.6-python.org.dmg) > and finally also the latest version of scypy ( > scipy-0.7.2-py2.6-python.org.dmg > ). > When I try the test I obtain the following errors > > Python 2.6.5 (r265:79359, Mar 24 2010, 01:32:55) > [GCC 4.0.1 (Apple Inc. build 5493)] on darwin > Type "help", "copyright", "credits" or "license" for more information. 
> >>> import numpy > >>> numpy.test('1','10') > Running unit tests for numpy > Traceback (most recent call last): > File "", line 1, in > File > "/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/site-packages/numpy/testing/nosetester.py", > line 326, in test > self._show_system_info() > File > "/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/site-packages/numpy/testing/nosetester.py", > line 187, in _show_system_info > nose = import_nose() > File > "/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/site-packages/numpy/testing/nosetester.py", > line 69, in import_nose > raise ImportError(msg) > ImportError: Need nose >= 0.10.0 for tests - see > http://somethingaboutorange.com/mrl/projects/nose > >>> > > any idea? > > -- > ---------------------------------------------- > nicola.vianello at gmail.com > skype:nicolavianello > ---------------------------------------------- > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -- Patrick Marsh Ph.D. Student / NSSL Liaison to the HWT School of Meteorology / University of Oklahoma Cooperative Institute for Mesoscale Meteorological Studies National Severe Storms Laboratory http://www.patricktmarsh.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From nicola.vianello at gmail.com Thu May 13 18:07:22 2010 From: nicola.vianello at gmail.com (Nicola) Date: Fri, 14 May 2010 00:07:22 +0200 Subject: [Numpy-discussion] mac os x installation In-Reply-To: References: Message-ID: thank you for the information but now the error is the following Python 2.6.5 (r265:79359, Mar 24 2010, 01:32:55) [GCC 4.0.1 (Apple Inc. build 5493)] on darwin Type "help", "copyright", "credits" or "license" for more information. >>> import numpy >>> numpy.test('1','10') Running unit tests for numpy NumPy version 1.4.1 NumPy is installed in /Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/site-packages/numpy Python version 2.6.5 (r265:79359, Mar 24 2010, 01:32:55) [GCC 4.0.1 (Apple Inc. build 5493)] nose version 0.11.3 Traceback (most recent call last): File "", line 1, in File "/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/site-packages/numpy/testing/nosetester.py", line 335, in test t = NumpyTestProgram(argv=argv, exit=False, plugins=plugins) File "nose/core.py", line 117, in __init__ **extra_args) File "/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/unittest.py", line 816, in __init__ self.parseArgs(argv) File "nose/core.py", line 134, in parseArgs self.config.configure(argv, doc=self.usage()) File "nose/config.py", line 273, in configure options, args = self._parseArgs(argv, cfg_files) File "nose/config.py", line 261, in _parseArgs return parser.parseArgsAndConfigFiles(argv[1:], cfg_files) File "nose/config.py", line 132, in parseArgsAndConfigFiles self._applyConfigurationToValues(self._parser, config, values) File "nose/config.py", line 118, in _applyConfigurationToValues name=name, filename=filename) File "nose/config.py", line 258, in warn_sometimes raise ConfigError(msg) nose.config.ConfigError: Error reading config file 'setup.cfg': no such option 'doctest-extension' >>> On Thu, May 13, 2010 at 11:57 PM, Patrick Marsh wrote: > You need to install the "nose" module to run the test suite. > > http://code.google.com/p/python-nose/ > > > Patrick > > On Thu, May 13, 2010 at 4:55 PM, Nicola wrote: > >> Hi. 
I've to admit that I'm quite new in python and also to numpy. I'm >> trying to install numpy and scypy on my mac (Mac Os X 10.6.3). I've >> installed the last version of python (Python 2.6.5 Mac OS X Installer >> Disk Image >> ). >> After than I've downloaded the latest version of numpy ( >> numpy-1.4.1-py2.6-python.org.dmg) >> and finally also the latest version of scypy ( >> scipy-0.7.2-py2.6-python.org.dmg >> ). >> When I try the test I obtain the following errors >> >> Python 2.6.5 (r265:79359, Mar 24 2010, 01:32:55) >> [GCC 4.0.1 (Apple Inc. build 5493)] on darwin >> Type "help", "copyright", "credits" or "license" for more information. >> >>> import numpy >> >>> numpy.test('1','10') >> Running unit tests for numpy >> Traceback (most recent call last): >> File "", line 1, in >> File >> "/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/site-packages/numpy/testing/nosetester.py", >> line 326, in test >> self._show_system_info() >> File >> "/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/site-packages/numpy/testing/nosetester.py", >> line 187, in _show_system_info >> nose = import_nose() >> File >> "/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/site-packages/numpy/testing/nosetester.py", >> line 69, in import_nose >> raise ImportError(msg) >> ImportError: Need nose >= 0.10.0 for tests - see >> http://somethingaboutorange.com/mrl/projects/nose >> >>> >> >> any idea? >> >> -- >> ---------------------------------------------- >> nicola.vianello at gmail.com >> skype:nicolavianello >> ---------------------------------------------- >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> > > > -- > Patrick Marsh > Ph.D. Student / NSSL Liaison to the HWT > School of Meteorology / University of Oklahoma > Cooperative Institute for Mesoscale Meteorological Studies > National Severe Storms Laboratory > http://www.patricktmarsh.com > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -- ---------------------------------------------- nicola.vianello at gmail.com skype:nicolavianello ---------------------------------------------- -------------- next part -------------- An HTML attachment was scrubbed... URL: From porterj at alum.rit.edu Thu May 13 19:34:11 2010 From: porterj at alum.rit.edu (Jim Porter) Date: Thu, 13 May 2010 18:34:11 -0500 Subject: [Numpy-discussion] zeros_like and friends shouldn't use ndarray.__new__(type(a), ...) Message-ID: <4BEC8C73.3050106@alum.rit.edu> Ok, let's try sending this message again, since it looks like I can't send from gmane... (See discussion on python-list at http://permalink.gmane.org/gmane.comp.python.general/661328 for context) numpy.zeros_like contains the following code: def zeros_like(a): if isinstance(a, ndarray): res = ndarray.__new__(type(a), a.shape, a.dtype, order=a.flags.fnc) res.fill(0) return res ... This is a problem because basetype.__new__(subtype, ...) raises an exception when subtype is defined from C (specifically, when Py_TPFLAGS_HEAPTYPE is not set). There's a check in Objects/typeobject.c in tp_new_wrapper that disallows this (you can grep for "is not safe" to find there the exception is raised). The end result is that it's impossible to use zeros_like, ones_like or empty_like with ndarray subtypes defined in C. 
While I'm still not sure why Python needs this check in general, Robert Kern pointed out that the problem can be fixed pretty easily in NumPy by changing zeros_like and friends to something like this (with some modifications from me): def zeros_like(a): if isinstance(a, ndarray): res = numpy.zeros(a.shape, a.dtype, order=a.flags.fnc) res = res.view(type(a)) res.__array_finalize__(a) return res ... - Jim From vincent at vincentdavis.net Thu May 13 23:06:41 2010 From: vincent at vincentdavis.net (Vincent Davis) Date: Thu, 13 May 2010 21:06:41 -0600 Subject: [Numpy-discussion] missing='' not documented in genfromtxt() Message-ID: Maybe I am missing something but it does not appear that missing='' is documented although it is shown s an argument ? genfromtxt(fname, dtype=float, comments='#', delimiter=None, skiprows=0, skip_header=0, skip_footer=0, converters=None, missing='', missing_values=None, filling_values=None, usecols=None, names=None, excludelist=None, deletechars=None, autostrip=False, case_sensitive=True, defaultfmt="f%i", unpack=None, usemask=False, loose=True, invalid_raise=True): From pgmdevlist at gmail.com Thu May 13 23:33:41 2010 From: pgmdevlist at gmail.com (Pierre GM) Date: Thu, 13 May 2010 23:33:41 -0400 Subject: [Numpy-discussion] missing='' not documented in genfromtxt() In-Reply-To: References: Message-ID: <7EBE1879-A574-4BE6-8DDB-478387431D65@gmail.com> On May 13, 2010, at 11:06 PM, Vincent Davis wrote: > Maybe I am missing something but it does not appear that missing='' is > documented although it is shown s an argument ? Because the use of `missing` is deprecated (try to use anything but '' for missing, and you'll get a deprecation warning). Use `missing_values` instead. From vincent at vincentdavis.net Thu May 13 23:51:17 2010 From: vincent at vincentdavis.net (Vincent Davis) Date: Thu, 13 May 2010 21:51:17 -0600 Subject: [Numpy-discussion] missing='' not documented in genfromtxt() In-Reply-To: <7EBE1879-A574-4BE6-8DDB-478387431D65@gmail.com> References: <7EBE1879-A574-4BE6-8DDB-478387431D65@gmail.com> Message-ID: > Because the use of `missing` is deprecated (try to use anything but '' for missing, and you'll get a deprecation warning). > Use `missing_values` instead. I wasn't using 'missing' but was wondering what it did. @Pierre, St?fan van der Walt suggested that genfromtxt was your baby :) Anyway I have a bug reported for genfromtxt with the fix St?fan used on recfromcsv, I have thought about addressing it but was giving you a chance, but I didn't know who "you" where, http://projects.scipy.org/numpy/ticket/1473 Thanks Vincent On Thu, May 13, 2010 at 9:33 PM, Pierre GM wrote: > On May 13, 2010, at 11:06 PM, Vincent Davis wrote: >> Maybe I am missing something but it does not appear that missing='' is >> documented although it is shown s an argument ? > > Because the use of `missing` is deprecated (try to use anything but '' for missing, and you'll get a deprecation warning). > Use `missing_values` instead. > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From opossumnano at gmail.com Fri May 14 13:43:14 2010 From: opossumnano at gmail.com (Tiziano Zito) Date: Fri, 14 May 2010 19:43:14 +0200 Subject: [Numpy-discussion] ANN: MDP release 2.6 and MDP Sprint 2010 Message-ID: <20100514174314.GD29048@multivac.zonafranca> We are glad to announce release 2.6 of the Modular toolkit for Data Processing (MDP). 
MDP is a Python library of widely used data processing algorithms that can be combined according to a pipeline analogy to build more complex data processing software. The base of available algorithms includes, to name but the most common, Principal Component Analysis (PCA and NIPALS), several Independent Component Analysis algorithms (CuBICA, FastICA, TDSEP, JADE, and XSFA), Slow Feature Analysis, Restricted Boltzmann Machine, and Locally Linear Embedding. What's new in version 2.6? -------------------------- - Several new classifier nodes have been added. - A new node extension mechanism makes it possible to dynamically add methods or attributes for specific features to node classes, enabling aspect-oriented programming in MDP. Several MDP features (like parallelization) are now based on this mechanism, and users can add their own custom node extensions. - BiMDP is a large new package in MDP that introduces bidirectional data flows to MDP, including backpropagation and even loops. BiMDP also enables the transportation of additional data in flows via messages. - BiMDP includes a new flow inspection tool, that runs as as a graphical debugger in the webrowser to step through complex flows. It can be extended by users for the analysis and visualization of intermediate data. - As usual, tons of bug fixes The new additions in the library have been thoroughly tested but, as usual after a public release, we especially welcome user's feedback and bug reports. MDP Sprint 2010 --------------- Following our tradition of sprint-driven development, the team of the core developers decided to organize a programming sprint open to external participants. We invite in particular all users who implemented new algorithms and would like to see them integrated in MDP: you will work together with a core developer! More info: http://sourceforge.net/apps/mediawiki/mdp-toolkit/index.php?title=MDP_Sprint_2010 Resources --------- Download: http://sourceforge.net/projects/mdp-toolkit/files Homepage: http://mdp-toolkit.sourceforge.net Mailing list: http://lists.sourceforge.net/mailman/listinfo/mdp-toolkit-users -- Pietro Berkes Volen Center for Complex Systems Brandeis University Waltham, MA, USA Rike-Benjamin Schuppner Berlin, Germany Niko Wilbert Institute for Theoretical Biology Humboldt-University Berlin, Germany Tiziano Zito Modelling of Cognitive Processes Berlin Institute of Technology and Bernstein Center for Computational Neuroscience Berlin, Germany From robert.kern at gmail.com Fri May 14 13:47:46 2010 From: robert.kern at gmail.com (Robert Kern) Date: Fri, 14 May 2010 13:47:46 -0400 Subject: [Numpy-discussion] default behavior of argsort In-Reply-To: <28542476.post@talk.nabble.com> References: <28542476.post@talk.nabble.com> Message-ID: On Wed, May 12, 2010 at 20:19, Dr. Phillip M. Feldman wrote: > > When operating on an array whose last dimension is unity, the default > behavior of argsort is not very useful: > > |6> x=random.random((4,1)) > |7> shape(x) > ? ? ? ? ? ? ? ? ? ? ?<7> (4, 1) > |8> argsort(x) > ? ? ? ? ? ? ? ? ? ? ?<8> > array([[0], > ? ? ? [0], > ? ? ? [0], > ? ? ? [0]]) > |9> argsort(x,axis=0) > ? ? ? ? ? ? ? ? ? ? ?<9> > array([[0], > ? ? ? [2], > ? ? ? [1], > ? ? ? [3]]) Sorry, but I don't think we are going to add a special case for this. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." 
-- Umberto Eco From pgmdevlist at gmail.com Fri May 14 14:06:31 2010 From: pgmdevlist at gmail.com (Pierre GM) Date: Fri, 14 May 2010 14:06:31 -0400 Subject: [Numpy-discussion] missing='' not documented in genfromtxt() In-Reply-To: References: <7EBE1879-A574-4BE6-8DDB-478387431D65@gmail.com> Message-ID: On May 13, 2010, at 11:51 PM, Vincent Davis wrote: >> Because the use of `missing` is deprecated (try to use anything but '' for missing, and you'll get a deprecation warning). >> Use `missing_values` instead. > > I wasn't using 'missing' but was wondering what it did. > > @Pierre, St?fan van der Walt suggested that genfromtxt was your baby > :) I'd say *creature* ;). Initially, it's a rip-off from an equivalent function in matplotlib (props to John D. Hunter for the original), but it got reorganized and patched over the months... > Anyway I have a bug reported for genfromtxt with the fix St?fan > used on recfromcsv, I have thought about addressing it but was giving > you a chance, but I didn't know who "you" where, > > http://projects.scipy.org/numpy/ticket/1473 I followed the discussion on pystatmodels from a distance. The explanation that was given on why you get a IOError looks like the correct one indeed, but I'll investigate that further on that this week-end... From bblais at bryant.edu Fri May 14 14:43:19 2010 From: bblais at bryant.edu (Brian Blais) Date: Fri, 14 May 2010 14:43:19 -0400 Subject: [Numpy-discussion] memory leak? Message-ID: Hello, I have the following code, where I noticed a memory leak with +=, but not with + alone. import numpy m=numpy.matrix(numpy.ones((23,23))) for i in range(10000000): m+=0.0 # keeps growing in memory # m=m+0.0 # is stable in memory My version of python is 2.5, numpy 1.3.0, but it also causes memory build-up in 2.6 with numpy 1.4.0, as distributed by the Enthought Python Distribution. It's easy to work around, but could cause someone some problems. Anyone else get this? bb -- Brian Blais bblais at bryant.edu http://web.bryant.edu/~bblais http://bblais.blogspot.com/ From josef.pktd at gmail.com Fri May 14 15:26:17 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Fri, 14 May 2010 15:26:17 -0400 Subject: [Numpy-discussion] memory leak? In-Reply-To: References: Message-ID: On Fri, May 14, 2010 at 2:43 PM, Brian Blais wrote: > Hello, > > I have the following code, where I noticed a memory leak with +=, but > not with + alone. > import numpy > > m=numpy.matrix(numpy.ones((23,23))) > > for i in range(10000000): > ? ? m+=0.0 ?# keeps growing in memory > ? ? # ? ?m=m+0.0 ?# is stable in memory > > > My version of python is 2.5, numpy 1.3.0, but it also causes memory > build-up in 2.6 with numpy 1.4.0, as distributed by the Enthought > Python Distribution. > > It's easy to work around, but could cause someone some problems. > Anyone else get this? I get it also with python 2.5 numpy 1.4.0 Who owns the data ? >>> m=np.matrix(np.ones((3,3))) >>> m.flags C_CONTIGUOUS : True F_CONTIGUOUS : False OWNDATA : True WRITEABLE : True ALIGNED : True UPDATEIFCOPY : False >>> m+=0 >>> m.flags C_CONTIGUOUS : True F_CONTIGUOUS : False OWNDATA : False <- GONE WRITEABLE : True ALIGNED : True UPDATEIFCOPY : False Josef > > > ? ? ? ? ? ? ? ? ? ? ? 
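As a side note to the workaround question: calling the ufunc with an explicit output buffer keeps the update strictly in place, regardless of how the matrix subclass wraps the result of +=. A small sketch (it only shows that the explicit form preserves ownership of the buffer; it does not claim to reproduce the memory build-up reported above):

import numpy as np

m = np.matrix(np.ones((23, 23)))
print(m.flags.owndata)      # True: m owns its buffer
np.add(m, 0.0, out=m)       # write the result straight into m's buffer
print(m.flags.owndata)      # still True, no re-wrapped temporary is kept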
?bb > > -- > Brian Blais > bblais at bryant.edu > http://web.bryant.edu/~bblais > http://bblais.blogspot.com/ > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From josef.pktd at gmail.com Fri May 14 16:03:56 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Fri, 14 May 2010 16:03:56 -0400 Subject: [Numpy-discussion] memory leak? In-Reply-To: References: Message-ID: On Fri, May 14, 2010 at 3:26 PM, wrote: > On Fri, May 14, 2010 at 2:43 PM, Brian Blais wrote: >> Hello, >> >> I have the following code, where I noticed a memory leak with +=, but >> not with + alone. >> import numpy >> >> m=numpy.matrix(numpy.ones((23,23))) >> >> for i in range(10000000): >> ? ? m+=0.0 ?# keeps growing in memory >> ? ? # ? ?m=m+0.0 ?# is stable in memory >> >> >> My version of python is 2.5, numpy 1.3.0, but it also causes memory >> build-up in 2.6 with numpy 1.4.0, as distributed by the Enthought >> Python Distribution. >> >> It's easy to work around, but could cause someone some problems. >> Anyone else get this? > > I get it also with python 2.5 numpy 1.4.0 > > Who owns the data ? > >>>> m=np.matrix(np.ones((3,3))) >>>> m.flags > ?C_CONTIGUOUS : True > ?F_CONTIGUOUS : False > ?OWNDATA : True > ?WRITEABLE : True > ?ALIGNED : True > ?UPDATEIFCOPY : False > >>>> m+=0 >>>> m.flags > ?C_CONTIGUOUS : True > ?F_CONTIGUOUS : False > ?OWNDATA : False ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? <- GONE > ?WRITEABLE : True > ?ALIGNED : True > ?UPDATEIFCOPY : False > > Josef > Maybe it's not a "true" memory leak, my python process eventually garbage collected the extra memory that was built up. Josef >> >> >> ? ? ? ? ? ? ? ? ? ? ? ?bb >> >> -- >> Brian Blais >> bblais at bryant.edu >> http://web.bryant.edu/~bblais >> http://bblais.blogspot.com/ >> >> >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> > From pfeldman at verizon.net Fri May 14 17:03:46 2010 From: pfeldman at verizon.net (Dr. Phillip M. Feldman) Date: Fri, 14 May 2010 14:03:46 -0700 (PDT) Subject: [Numpy-discussion] default behavior of argsort In-Reply-To: References: <28542476.post@talk.nabble.com> Message-ID: <28564261.post@talk.nabble.com> Robert Kern-2 wrote: > > On Wed, May 12, 2010 at 20:19, Dr. Phillip M. Feldman > wrote: >> >> When operating on an array whose last dimension is unity, the default >> behavior of argsort is not very useful: >> >> |6> x=random.random((4,1)) >> |7> shape(x) >> ? ? ? ? ? ? ? ? ? ? ?<7> (4, 1) >> |8> argsort(x) >> ? ? ? ? ? ? ? ? ? ? ?<8> >> array([[0], >> ? ? ? [0], >> ? ? ? [0], >> ? ? ? [0]]) >> |9> argsort(x,axis=0) >> ? ? ? ? ? ? ? ? ? ? ?<9> >> array([[0], >> ? ? ? [2], >> ? ? ? [1], >> ? ? ? [3]]) > > Sorry, but I don't think we are going to add a special case for this. > > -- > Robert Kern > I don't see this as a special case. When axis is unspecified, the default is axis=-1, which causes argsort to operate on the last dimension. A more sensible default would be the last non-unity dimension. Phillip -- View this message in context: http://old.nabble.com/default-behavior-of-argsort-tp28542476p28564261.html Sent from the Numpy-discussion mailing list archive at Nabble.com. 
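For reference, the behaviour under discussion can already be obtained explicitly with existing calls; a small sketch, no new API involved:

import numpy as np

x = np.random.random((4, 1))
print(np.argsort(x))             # default axis=-1 sorts each length-1 row: all zeros
print(np.argsort(x, axis=0))     # sort order along the length-4 axis
print(np.argsort(x.ravel()))     # or drop the unit dimension first (np.squeeze(x) works too)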
From efiring at hawaii.edu Fri May 14 17:29:32 2010 From: efiring at hawaii.edu (Eric Firing) Date: Fri, 14 May 2010 11:29:32 -1000 Subject: [Numpy-discussion] default behavior of argsort In-Reply-To: <28564261.post@talk.nabble.com> References: <28542476.post@talk.nabble.com> <28564261.post@talk.nabble.com> Message-ID: <4BEDC0BC.4080703@hawaii.edu> On 05/14/2010 11:03 AM, Dr. Phillip M. Feldman wrote: > > > > Robert Kern-2 wrote: >> >> On Wed, May 12, 2010 at 20:19, Dr. Phillip M. Feldman >> wrote: >>> >>> When operating on an array whose last dimension is unity, the default >>> behavior of argsort is not very useful: >>> >>> |6> x=random.random((4,1)) >>> |7> shape(x) >>> <7> (4, 1) >>> |8> argsort(x) >>> <8> >>> array([[0], >>> [0], >>> [0], >>> [0]]) >>> |9> argsort(x,axis=0) >>> <9> >>> array([[0], >>> [2], >>> [1], >>> [3]]) >> >> Sorry, but I don't think we are going to add a special case for this. >> >> -- >> Robert Kern >> > > I don't see this as a special case. When axis is unspecified, the default > is axis=-1, which causes argsort to operate on the last dimension. A more > sensible default would be the last non-unity dimension. That would be too clever for my liking. First, a default should be something that can also be explicitly specified; how would you use the axis kwarg to specify the last non-unit dimension? Second, treating a unit dimension differently from a non-unit dimension *is* making it a special case, and often--usually--one does not want that. It is perfectly reasonable to have an algorithm that uses values sorted along the last axis, even if that dimension sometimes turns out to be one. Eric > > Phillip From vincent at vincentdavis.net Fri May 14 17:40:11 2010 From: vincent at vincentdavis.net (Vincent Davis) Date: Fri, 14 May 2010 15:40:11 -0600 Subject: [Numpy-discussion] recarray question Message-ID: The setup: >>> Adata array([(1, 24, 'Male', '', 212, 193, 'High Pass'), (2, 26, 'Male', 'Caucasian', 234, 221, 'Honors'), (3, 31, 'Female', 'Caucasian', 182, 189, ''), (4, 27, 'Female', 'Hispanic', 214, 211, 'High Pass'), (5, 27, 'Female', 'Asian', 213, 204, 'Pass'), (6, 29, 'Female', 'Caucasian', 209, -1, 'High Pass'), (7, 26, 'Female', 'Hispanic', 212, -1, 'Honors'), (8, 25, 'Female', 'Caucasian', 230, 238, 'Honors'), (9, 27, 'Female', 'Caucasian', 239, 245, 'Honors'), (10, 26, 'Male', 'Caucasian', 226, -1, 'Honors')], dtype=[('ID', '>> BData array([(1,), (1,), (0,), (0,), (0,), (0,), (0,), (0,), (0,), (1,)], dtype=[('Gender', '>> Adata['Gender'] = Bdata['Gender'] >>> Adata array([(1, 24, '1', '', 212, 193, 'High Pass'), (2, 26, '1', 'Caucasian', 234, 221, 'Honors'), (3, 31, '0', 'Caucasian', 182, 189, ''), (4, 27, '0', 'Hispanic', 214, 211, 'High Pass'), (5, 27, '0', 'Asian', 213, 204, 'Pass'), (6, 29, '0', 'Caucasian', 209, -1, 'High Pass'), (7, 26, '0', 'Hispanic', 212, -1, 'Honors'), (8, 25, '0', 'Caucasian', 230, 238, 'Honors'), (9, 27, '0', 'Caucasian', 239, 245, 'Honors'), (10, 26, '1', 'Caucasian', 226, -1, 'Honors')], dtype=[('ID', ' From robert.kern at gmail.com Fri May 14 17:53:41 2010 From: robert.kern at gmail.com (Robert Kern) Date: Fri, 14 May 2010 17:53:41 -0400 Subject: [Numpy-discussion] default behavior of argsort In-Reply-To: <4BEDC0BC.4080703@hawaii.edu> References: <28542476.post@talk.nabble.com> <28564261.post@talk.nabble.com> <4BEDC0BC.4080703@hawaii.edu> Message-ID: On Fri, May 14, 2010 at 17:29, Eric Firing wrote: > On 05/14/2010 11:03 AM, Dr. Phillip M. Feldman wrote: >> >> Robert Kern-2 wrote: >>> >>> On Wed, May 12, 2010 at 20:19, Dr. 
Phillip M. Feldman >>> ?wrote: >>>> >>>> When operating on an array whose last dimension is unity, the default >>>> behavior of argsort is not very useful: >>>> >>>> |6> ?x=random.random((4,1)) >>>> |7> ?shape(x) >>>> ? ? ? ? ? ? ? ? ? ? ? <7> ?(4, 1) >>>> |8> ?argsort(x) >>>> ? ? ? ? ? ? ? ? ? ? ? <8> >>>> array([[0], >>>> ? ? ? ?[0], >>>> ? ? ? ?[0], >>>> ? ? ? ?[0]]) >>>> |9> ?argsort(x,axis=0) >>>> ? ? ? ? ? ? ? ? ? ? ? <9> >>>> array([[0], >>>> ? ? ? ?[2], >>>> ? ? ? ?[1], >>>> ? ? ? ?[3]]) >>> >>> Sorry, but I don't think we are going to add a special case for this. >>> >>> -- >>> Robert Kern >>> >> >> I don't see this as a special case. ?When axis is unspecified, the default >> is axis=-1, which causes argsort to operate on the last dimension. ?A more >> sensible default would be the last non-unity dimension. > > That would be too clever for my liking. ?First, a default should be > something that can also be explicitly specified; how would you use the > axis kwarg to specify the last non-unit dimension? None would be reasonable. Unfortunately, that would be inconsistent with the interpretation of other axis=None arguments elsewhere in numpy. >?Second, treating a > unit dimension differently from a non-unit dimension *is* making it a > special case, and often--usually--one does not want that. ?It is > perfectly reasonable to have an algorithm that uses values sorted along > the last axis, even if that dimension sometimes turns out to be one. Right. Changing behavior on an edge case makes everyone else have to deal with that edge case in their code. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From Chris.Barker at noaa.gov Fri May 14 21:05:14 2010 From: Chris.Barker at noaa.gov (Chris Barker) Date: Fri, 14 May 2010 18:05:14 -0700 Subject: [Numpy-discussion] default behavior of argsort In-Reply-To: References: <28542476.post@talk.nabble.com> <28564261.post@talk.nabble.com> <4BEDC0BC.4080703@hawaii.edu> Message-ID: <4BEDF34A.7070904@noaa.gov> >> Second, treating a >> unit dimension differently from a non-unit dimension *is* making it a >> special case, and often--usually--one does not want that. It is >> perfectly reasonable to have an algorithm that uses values sorted along >> the last axis, even if that dimension sometimes turns out to be one. > > Right. Changing behavior on an edge case makes everyone else have to > deal with that edge case in their code. not to hammer a point home (OK, it IS to hammer a point home), this is one of the things that drove me crazy about MATLAB -- everything was a 2-d array, unless one dimension happened to have length 1, and then some (but not all) functions treated it as 1-d. I had to write me own version of sum() for instance, that would sum over the last dimension, even if it happened to be 1. numpy provides n-d arrays, so you don't have that silliness -- if you want a 1-d array, use a 1-d array. I can't find it right now, but I'm pretty sure there is a function that will re-shape an array to remove the length-1 dimensions -- maybe that's what the OP needs. -Chris -- Christopher Barker, Ph.D. 
Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From Chris.Barker at noaa.gov Fri May 14 21:09:58 2010 From: Chris.Barker at noaa.gov (Chris Barker) Date: Fri, 14 May 2010 18:09:58 -0700 Subject: [Numpy-discussion] default behavior of argsort In-Reply-To: <4BEDF34A.7070904@noaa.gov> References: <28542476.post@talk.nabble.com> <28564261.post@talk.nabble.com> <4BEDC0BC.4080703@hawaii.edu> <4BEDF34A.7070904@noaa.gov> Message-ID: <4BEDF466.1060007@noaa.gov> Chris Barker wrote: > I can't find it right now, but I'm pretty sure there is a function that > will re-shape an array to remove the length-1 dimensions -- maybe that's > what the OP needs. it's np.squeeze() -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From bblais at bryant.edu Fri May 14 22:24:15 2010 From: bblais at bryant.edu (Brian Blais) Date: Fri, 14 May 2010 22:24:15 -0400 Subject: [Numpy-discussion] memory leak? In-Reply-To: References: Message-ID: On May 14, 2010, at 16:03 , josef.pktd at gmail.com wrote: > On Fri, May 14, 2010 at 3:26 PM, wrote: >> On Fri, May 14, 2010 at 2:43 PM, Brian Blais >> wrote: >>> Hello, >>> >>> I have the following code, where I noticed a memory leak with +=, >>> but >>> not with + alone. >>> import numpy >>> >>> m=numpy.matrix(numpy.ones((23,23))) >>> >>> for i in range(10000000): >>> m+=0.0 # keeps growing in memory >>> # m=m+0.0 # is stable in memory >>> >>> > Maybe it's not a "true" memory leak, my python process eventually > garbage collected the extra memory that was built up. > It crashed a simulator of mine (at least in Windows), until I figured out the workaround, so I would consider it a leak. :) it certainly shouldn't grow like that. if anything m=m+0.0 should chew of *more* memory than m+=0.0. bb -- Brian Blais bblais at bryant.edu http://web.bryant.edu/~bblais http://bblais.blogspot.com/ From jlhouchin at gmail.com Sat May 15 00:24:03 2010 From: jlhouchin at gmail.com (Jimmie Houchin) Date: Fri, 14 May 2010 23:24:03 -0500 Subject: [Numpy-discussion] Problems creating numpy.array with a dtype Message-ID: Hello, I am really liking Numpy a lot. It is wonderful to be able to do the things that it does in a language as friendly as Python, and with the performance Numpy delivers over standard Python. Thanks. I am having a problem with creation of Numpy arrays with my generated dtypes. I am creating a dataset of a weeks worth of financial instruments data, in order to explore and test relationships with various Technical Analysis functions. My basic data is simply a list (or tuple) of lists (or tuples). ((startdate, bidopen, bidhigh, bidlow, bidclose, askopen, askhigh, asklow, askclose), ...) Nothing unusual. However I am creating arrays which have many, many more columns to allow storing the data generated from applying the functions to the original data. I have created two functions. One to dynamically create the dtype based on data I want to create for the exploration. And another to create the array and populate it with the initial data from a database. Code slightly modified, not tested. 
#examples taFunctions = (smva, wmva) inputColumns = (bidclose, ohlcavg) def createDType(): """Will create a dtype based on the pattern for naming and the parameters of those items being stored in the array. """ dttypes = [('startdate','object'), ('bidopen','f8'), ('bidhigh','f8'), ('bidlow','f8'), ('bidclose','f8'), ('askopen','f8'), ('askhigh','f8'), ('asklow','f8'), ('askclose','f8'), ('ocavg','f8'), ('hlavg','f8'), ('ohlavg','f8'), ('ohlcavg','f8'), ('direction','i1'), ('volatility', 'f8'), ('spread', 'f8'), ('pivot', 'S4')] for f in taFunctions: for i in inputColumns: dttypes.append((f+"-"+i,'f8')) dtminute = np.dtype(dttypes) return dtminute, dttypes def getArray(instrument, weekString=None): ... cur.execute(sql) weekData = cur.fetchall() wdata = [] lst = [] dtminute, dttypes = createDType() for i in dttypes: if i[1] == 'f8': lst.append(0.0) elif i[1] == 'i1': lst.append(0) else: lst.append('') for m in weekData: data = list(m)+lst[9:] wdata.append(data) return np.array(wdata,dtype=dtminute) The createDType() function works fine. The getArray() function fails with: ValueError: Setting void-array with object members using buffer. However changing the getArray() function to this works just fine. def getArray(instrument, weekString=None): ... cur.execute(sql) weekData = cur.fetchall() arrayLength = len(weekData) lst = [] dtminute, dttypes = createDType() for i in dttypes: if i[1] == 'f8': lst.append(0.0) elif i[1] == 'i1': lst.append(0) else: lst.append('') listLength = len(lst) weekArray = np.zeros(arrayLength, dtype=dtminute) for i in range(arrayLength): for j in range(listLength): if j < 9: weekArray[i][j] = weekData[i][j] else: weekArray[i][j] = lst[j] return weekArray After I finally worked out getArray number two I am back in business writing the rest of my app. But I banged my head on version number one for quite some time trying to figure out what I am doing wrong. I still don't know. I find no errors in my data length or types. I would thing that either would cause version two to fail also. In help in understanding is greatly appreciated. This is using Numpy 1.4.1. Thanks. Jimmie From pfeldman at verizon.net Sat May 15 03:15:15 2010 From: pfeldman at verizon.net (Dr. Phillip M. Feldman) Date: Sat, 15 May 2010 00:15:15 -0700 (PDT) Subject: [Numpy-discussion] default behavior of argsort In-Reply-To: <4BEDC0BC.4080703@hawaii.edu> References: <28542476.post@talk.nabble.com> <28564261.post@talk.nabble.com> <4BEDC0BC.4080703@hawaii.edu> Message-ID: <28566701.post@talk.nabble.com> efiring wrote: > > On 05/14/2010 11:03 AM, Dr. Phillip M. Feldman wrote: >> > It is perfectly reasonable to have an algorithm that uses values > sorted along > the last axis, even if that dimension sometimes turns out to be one. > > Eric > Excellent point! I agree. Case closed. Phillip -- View this message in context: http://old.nabble.com/default-behavior-of-argsort-tp28542476p28566701.html Sent from the Numpy-discussion mailing list archive at Nabble.com. From josef.pktd at gmail.com Sat May 15 07:30:37 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Sat, 15 May 2010 07:30:37 -0400 Subject: [Numpy-discussion] Problems creating numpy.array with a dtype In-Reply-To: References: Message-ID: On Sat, May 15, 2010 at 12:24 AM, Jimmie Houchin wrote: > Hello, I am really liking Numpy a lot. It is wonderful to be able to do > the things that it does in a language as friendly as Python, and with > the performance Numpy delivers over standard Python. Thanks. 
> > I am having a problem with creation of Numpy arrays with my generated > dtypes. I am creating a dataset of a weeks worth of financial > instruments data, in order to explore and test relationships with > various Technical Analysis functions. > > My basic data is simply a list (or tuple) of lists (or tuples). > ((startdate, bidopen, bidhigh, bidlow, bidclose, askopen, askhigh, > asklow, askclose), ...) > > Nothing unusual. However I am creating arrays which have many, many more > columns to allow storing the data generated from applying the functions > to the original data. > > I have created two functions. One to dynamically create the dtype based > on data I want to create for the exploration. And another to create the > array and populate it with the initial data from a database. > > Code slightly modified, not tested. > > #examples > taFunctions = (smva, wmva) > inputColumns = (bidclose, ohlcavg) > > def createDType(): > ? ? """Will create a dtype based on the pattern for naming and the > ? ? ? ?parameters of those items being stored in the array. > ? ? """ > ? ? dttypes = [('startdate','object'), > ? ? ? ? ('bidopen','f8'), ('bidhigh','f8'), > ? ? ? ? ('bidlow','f8'), ('bidclose','f8'), > ? ? ? ? ('askopen','f8'), ('askhigh','f8'), > ? ? ? ? ('asklow','f8'), ('askclose','f8'), > ? ? ? ? ('ocavg','f8'), ('hlavg','f8'), ('ohlavg','f8'), > ? ? ? ? ('ohlcavg','f8'), ('direction','i1'), ('volatility', 'f8'), > ? ? ? ? ('spread', 'f8'), ('pivot', 'S4')] > ? ? for f in taFunctions: > ? ? ? ? for i in inputColumns: > ? ? ? ? ? ? dttypes.append((f+"-"+i,'f8')) > ? ? dtminute = np.dtype(dttypes) > ? ? return dtminute, dttypes > > def getArray(instrument, weekString=None): > ? ? ... > ? ? cur.execute(sql) > ? ? weekData = cur.fetchall() > ? ? wdata = [] > ? ? lst = [] > ? ? dtminute, dttypes = createDType() > ? ? for i in dttypes: > ? ? ? ? if i[1] == 'f8': lst.append(0.0) > ? ? ? ? elif i[1] == 'i1': lst.append(0) > ? ? ? ? else: lst.append('') > ? ? for m in weekData: > ? ? ? ? data = list(m)+lst[9:] > ? ? ? ? wdata.append(data) I think "data" here should be a tuple, i.e. tuple(data) structured arrays expect tuples for each element/row If this is not it, then you could provide a mini example of wdata with just a few rows. > ? ? return np.array(wdata,dtype=dtminute) > > The createDType() function works fine. The getArray() function fails with: > ValueError: Setting void-array with object members using buffer. cryptic exceptions messages in array construction usually means there is some structure in the argument data that numpy doesn't understand, I usually work with trial and error for a specific example Josef > > However changing the getArray() function to this works just fine. > > def getArray(instrument, weekString=None): > ? ? ... > ? ? cur.execute(sql) > ? ? weekData = cur.fetchall() > ? ? arrayLength = len(weekData) > ? ? lst = [] > ? ? dtminute, dttypes = createDType() > ? ? for i in dttypes: > ? ? ? ? if i[1] == 'f8': lst.append(0.0) > ? ? ? ? elif i[1] == 'i1': lst.append(0) > ? ? ? ? else: lst.append('') > ? ? listLength = len(lst) > ? ? weekArray = np.zeros(arrayLength, dtype=dtminute) > ? ? for i in range(arrayLength): > ? ? ? ? for j in range(listLength): > ? ? ? ? ? ? if j < 9: weekArray[i][j] = weekData[i][j] > ? ? ? ? ? ? else: weekArray[i][j] = lst[j] > ? ? return weekArray > > After I finally worked out getArray number two I am back in business > writing the rest of my app. 
But I banged my head on version number one > for quite some time trying to figure out what I am doing wrong. I still > don't know. > > I find no errors in my data length or types. I would thing that either > would cause version two to fail also. > > In help in understanding is greatly appreciated. > > This is using Numpy 1.4.1. > > Thanks. > > Jimmie > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From jlhouchin at gmail.com Sat May 15 09:27:34 2010 From: jlhouchin at gmail.com (Jimmie Houchin) Date: Sat, 15 May 2010 08:27:34 -0500 Subject: [Numpy-discussion] Problems creating numpy.array with a dtype In-Reply-To: References: Message-ID: On 5/15/2010 6:30 AM, josef.pktd at gmail.com wrote: > On Sat, May 15, 2010 at 12:24 AM, Jimmie Houchin wrote: >> def getArray(instrument, weekString=None): >> ... >> cur.execute(sql) >> weekData = cur.fetchall() >> wdata = [] >> lst = [] >> dtminute, dttypes = createDType() >> for i in dttypes: >> if i[1] == 'f8': lst.append(0.0) >> elif i[1] == 'i1': lst.append(0) >> else: lst.append('') >> for m in weekData: >> data = list(m)+lst[9:] >> wdata.append(data) > > I think "data" here should be a tuple, i.e. tuple(data) > structured arrays expect tuples for each element/row > > If this is not it, then you could provide a mini example of wdata with > just a few rows. > >> return np.array(wdata,dtype=dtminute) >> >> The createDType() function works fine. The getArray() function fails with: >> ValueError: Setting void-array with object members using buffer. > > cryptic exceptions messages in array construction usually means there > is some structure in the argument data that numpy doesn't understand, > I usually work with trial and error for a specific example > > Josef Hello Josef, Wrapping data, tuple(list(m)+lst[9:]) works. Thanks. For some reason I was under the impression that numpy accepted either lists or tuples as long as the shape of the structure, and the data types was the same as the dtype array structure that it is filling. Is there a particular reason this is not so? Again, thanks. I can now get rid of my moderately less elegant, but working second version. Jimmie From josef.pktd at gmail.com Sat May 15 09:37:17 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Sat, 15 May 2010 09:37:17 -0400 Subject: [Numpy-discussion] Problems creating numpy.array with a dtype In-Reply-To: References: Message-ID: On Sat, May 15, 2010 at 9:27 AM, Jimmie Houchin wrote: > On 5/15/2010 6:30 AM, josef.pktd at gmail.com wrote: >> On Sat, May 15, 2010 at 12:24 AM, Jimmie Houchin ?wrote: >>> def getArray(instrument, weekString=None): >>> ? ? ?... >>> ? ? ?cur.execute(sql) >>> ? ? ?weekData = cur.fetchall() >>> ? ? ?wdata = [] >>> ? ? ?lst = [] >>> ? ? ?dtminute, dttypes = createDType() >>> ? ? ?for i in dttypes: >>> ? ? ? ? ?if i[1] == 'f8': lst.append(0.0) >>> ? ? ? ? ?elif i[1] == 'i1': lst.append(0) >>> ? ? ? ? ?else: lst.append('') >>> ? ? ?for m in weekData: >>> ? ? ? ? ?data = list(m)+lst[9:] >>> ? ? ? ? ?wdata.append(data) >> >> I think "data" here should be a tuple, i.e. tuple(data) >> structured arrays expect tuples for each element/row >> >> If this is not it, then you could provide a mini example of wdata with >> just a few rows. >> >>> ? ? ?return np.array(wdata,dtype=dtminute) >>> >>> The createDType() function works fine. 
The getArray() function fails with: >>> ValueError: Setting void-array with object members using buffer. >> >> cryptic exceptions messages in array construction usually means there >> is some structure in the argument data that numpy doesn't understand, >> I usually work with trial and error for a specific example >> >> Josef > > Hello Josef, > > Wrapping data, ? tuple(list(m)+lst[9:]) > works. > > Thanks. > > For some reason I was under the impression that numpy accepted either > lists or tuples as long as the shape of the structure, and the data > types was the same as the dtype array structure that it is filling. > Is there a particular reason this is not so? the tuple (row) is one element of the structured array. It's possible to have an n-dimensional structured array where each element is a tuple. So, I guess, numpy needs the distinction between list and tuples to know what is an element. That's from hitting at this very often, I never looked at the numpy internals for this. Josef > > Again, thanks. I can now get rid of my moderately less elegant, but > working second version. > > Jimmie > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From vincent at vincentdavis.net Sat May 15 11:15:52 2010 From: vincent at vincentdavis.net (Vincent Davis) Date: Sat, 15 May 2010 09:15:52 -0600 Subject: [Numpy-discussion] Problems creating numpy.array with a dtype In-Reply-To: References: Message-ID: > > the tuple (row) is one element of the structured array. It's possible > to have an n-dimensional structured array where each element is a > tuple. Also just was looking at this and while you can't do this anarray = np.array([1,2,3], dtype = [('num', int)]) you can anarray = np.array([(1,),(2,),(3,)], dtype = [('num', int)]) Vincent On Sat, May 15, 2010 at 7:37 AM, wrote: > On Sat, May 15, 2010 at 9:27 AM, Jimmie Houchin > wrote: > > On 5/15/2010 6:30 AM, josef.pktd at gmail.com wrote: > >> On Sat, May 15, 2010 at 12:24 AM, Jimmie Houchin > wrote: > >>> def getArray(instrument, weekString=None): > >>> ... > >>> cur.execute(sql) > >>> weekData = cur.fetchall() > >>> wdata = [] > >>> lst = [] > >>> dtminute, dttypes = createDType() > >>> for i in dttypes: > >>> if i[1] == 'f8': lst.append(0.0) > >>> elif i[1] == 'i1': lst.append(0) > >>> else: lst.append('') > >>> for m in weekData: > >>> data = list(m)+lst[9:] > >>> wdata.append(data) > >> > >> I think "data" here should be a tuple, i.e. tuple(data) > >> structured arrays expect tuples for each element/row > >> > >> If this is not it, then you could provide a mini example of wdata with > >> just a few rows. > >> > >>> return np.array(wdata,dtype=dtminute) > >>> > >>> The createDType() function works fine. The getArray() function fails > with: > >>> ValueError: Setting void-array with object members using buffer. > >> > >> cryptic exceptions messages in array construction usually means there > >> is some structure in the argument data that numpy doesn't understand, > >> I usually work with trial and error for a specific example > >> > >> Josef > > > > Hello Josef, > > > > Wrapping data, tuple(list(m)+lst[9:]) > > works. > > > > Thanks. > > > > For some reason I was under the impression that numpy accepted either > > lists or tuples as long as the shape of the structure, and the data > > types was the same as the dtype array structure that it is filling. > > Is there a particular reason this is not so? 
> > the tuple (row) is one element of the structured array. It's possible > to have an n-dimensional structured array where each element is a > tuple. > > So, I guess, numpy needs the distinction between list and tuples to > know what is an element. > That's from hitting at this very often, I never looked at the numpy > internals for this. > > Josef > > > > > Again, thanks. I can now get rid of my moderately less elegant, but > > working second version. > > > > Jimmie > > > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > *Vincent Davis 720-301-3003 * vincent at vincentdavis.net my blog | LinkedIn -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Sat May 15 12:28:20 2010 From: robert.kern at gmail.com (Robert Kern) Date: Sat, 15 May 2010 11:28:20 -0500 Subject: [Numpy-discussion] Problems creating numpy.array with a dtype In-Reply-To: References: Message-ID: On Sat, May 15, 2010 at 08:37, wrote: > So, I guess, numpy needs the distinction between list and tuples to > know what is an element. > That's from hitting at this very often, I never looked at the numpy > internals for this. This is correct. There is only so much mind-reading that numpy.array() can do. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From sccolbert at gmail.com Sat May 15 16:05:34 2010 From: sccolbert at gmail.com (S. Chris Colbert) Date: Sat, 15 May 2010 16:05:34 -0400 Subject: [Numpy-discussion] Bug with how numpy.distutils.system_info handles the site.cfg In-Reply-To: References: Message-ID: <1273953934.1747.3.camel@broo> On Wed, 2010-05-12 at 23:06 -0400, Chris Colbert wrote: > I had this problem back in 2009 when building Enthought Enable, and > was happy with a work around. It just bit me again, and I finally got > around to drilling down to the problem. > > > On linux, if one uses the numpy/site.cfg [default] section when > building from source to specifiy local library directories, the x11 > libs won't be found by NumPy. > > > The relevant section of the site.cfg.example reads as follows: > > > # Defaults > # ======== > # The settings given here will apply to all other sections if not > overridden. > # This is a good place to add general library and include directories > like > # /usr/local/{lib,include} > # > #[DEFAULT] > #library_dirs = /usr/local/lib > #include_dirs = /usr/local/include > > > Now, I build NumPy with Atlas and my Atlas libs are installed > in /usr/local, so my [default] section of site.cfg looks like this (as > suggested by the site.cfg.example): > > > # Defaults > # ======== > # The settings given here will apply to all other sections if not > overridden. > # This is a good place to add general library and include directories > like > # /usr/local/{lib,include} > # > [DEFAULT] > library_dirs = /usr/local/lib:/usr/local/lib/atlas > include_dirs = /usr/local/include > > > > > NumPy builds and works fine with this. The problem occurs when other > libraries use numpy.distutils.system_info.get_info('x11') (ala > Enthought Enable). 
That function eventually calls > numpy.distutils.system_info.system_info.parse_config_files which has > the following definition: > > > def parse_config_files(self): > self.cp.read(self.files) > if not self.cp.has_section(self.section): > if self.section is not None: > self.cp.add_section(self.section) > > > When self.cp is instantiated (when looking for the x11 libs), it is > provided the following defaults: > > > {'libraries': '', 'src_dirs': '.:/usr/local/src', > 'search_static_first': '0', 'library_dirs': > '/usr/X11R6/lib64:/usr/X11R6/lib:/usr/X11/lib64:/usr/X11/lib:/usr/lib64:/usr/lib', 'include_dirs': '/usr/X11R6/include:/usr/X11/include:/usr/include'} > > > As is clearly seen, the 'library_dirs' contains the proper paths to > find the x11 libs. But since the config file has [default] section, > these paths get trampled and replaced with whatever is contained in > the site.cfg [default] section. In my case, this > is /usr/local/lib:/usr/local/lib/atlas. Thus, my x11 libs aren't found > and the Enable build fails. > > > The workaround is to include an [x11] section in site.cfg with the > appropriate paths, but I don't really feel this should be necessary. > Would the better behavior be to look for a [default] section in the > config file in the parse_config_files method and add those paths to > the already specified defaults? > > > Changing the site.cfg [default] section to read as follows: > > > [DEFAULT] > library_dirs = /usr/lib:/usr/local/lib:/usr/local/lib/atlas > include_dirs = /usr/include:/usr/local/include > > > is not an option because then NumPy will find and use the system > atlas, which in my case is not threaded nor optimized for my machine. > > > If you want me to patch the parse_config_files method, just let me > know. > > > Cheers, > > > Chris > Anyone have thoughts on this? Thinking more about it, I feel the appropriate behavior would be for numpy to prepend everything in the [default] section to its internal default paths, rather than override the internal defaults as it is currently doing. From mihalache at gmail.com Sun May 16 00:03:58 2010 From: mihalache at gmail.com (Gabriel Mihalache) Date: Sun, 16 May 2010 00:03:58 -0400 Subject: [Numpy-discussion] Wrong Eigenvalue (Approximation?) Message-ID: Hello, all! I'm new to Numpy and Python so please tolerate by ignorance on this but I'm having problems with some weird behavior. Consider the session: >>> import numpy as np >>> import numpy.linalg as la >>> x = np.array([[0.3, 0.2, 0.5], [0.2, 0.1, 0.7], [0.9, 0.05, 0.05]]).transpose() >>> esystem = la.eig(x) >>> esystem[0] array([ 1. ?, -0.3 , -0.25]) >>> esystem[0][0] 1.0000000000000004 The eigenvalue should be 1 exactly. In fact, later on I want to be able to do np.where(x == 1) which fails. The way I set up the matrix, I know for sure that there must be one eigenvalue exactly equal to 1. Any help is greatly appreciated! This is all on Python 2.6.5 (r265:79096, Mar 19 2010, 21:48:26) [MSC v.1500 32 bit (Intel)] on win32 Regards, Gabriel From ben.root at ou.edu Sun May 16 00:36:42 2010 From: ben.root at ou.edu (Benjamin Root) Date: Sat, 15 May 2010 23:36:42 -0500 Subject: [Numpy-discussion] Wrong Eigenvalue (Approximation?) In-Reply-To: References: Message-ID: On Sat, May 15, 2010 at 11:03 PM, Gabriel Mihalache wrote: > Hello, all! I'm new to Numpy and Python so please tolerate by > ignorance on this but I'm having problems with some weird behavior. 
> Consider the session: > > >>> import numpy as np > >>> import numpy.linalg as la > >>> x = np.array([[0.3, 0.2, 0.5], [0.2, 0.1, 0.7], [0.9, 0.05, > 0.05]]).transpose() > >>> esystem = la.eig(x) > >>> esystem[0] > array([ 1. , -0.3 , -0.25]) > >>> esystem[0][0] > 1.0000000000000004 > This seems correct to me. Floating point calculations in any language is not exact because of issues with how decimal numbers are stored in binary. Therefore... > > The eigenvalue should be 1 exactly. In fact, later on I want to be able to > do > > np.where(x == 1) > > > which fails. > is entirely expected. Trying to perform equality comparisons between floating point numbers is almost always doomed to failure, no matter which language you choose. There are plenty of resources on the internet about this, and it is very common to interpret as a bug by newcomers to scientific computing. I hope this helps. Ben Root -------------- next part -------------- An HTML attachment was scrubbed... URL: From lumtegis at gmail.com Sun May 16 07:50:56 2010 From: lumtegis at gmail.com (lumtegis at gmail.com) Date: Sun, 16 May 2010 11:50:56 +0000 Subject: [Numpy-discussion] Array input Message-ID: <1054.958477856@gmail.com> Hi all, Am creating a script to do least square adjustment of levelling data. How do I get user input into a 1D array of intergers. Thanks in advance. Allan maungu. From david.verelst at gmail.com Sun May 16 05:34:36 2010 From: david.verelst at gmail.com (David Verelst) Date: Sun, 16 May 2010 11:34:36 +0200 Subject: [Numpy-discussion] Array input In-Reply-To: <1054.958477856@gmail.com> References: <1054.958477856@gmail.com> Message-ID: Hi Allen, If you google on "python user input" you already have your answer... for instance: http://en.wikibooks.org/wiki/Python_Programming/Input_and_output Hope this helps, David > Hi all, > Am creating a script to do least square adjustment of levelling data. How do I get user input into a 1D array of intergers. ?Thanks in advance. > Allan maungu. From aisaac at american.edu Sun May 16 08:13:07 2010 From: aisaac at american.edu (Alan G Isaac) Date: Sun, 16 May 2010 08:13:07 -0400 Subject: [Numpy-discussion] Wrong Eigenvalue (Approximation?) In-Reply-To: References: Message-ID: <4BEFE153.2020900@american.edu> On 5/16/2010 12:03 AM, Gabriel Mihalache wrote: > The eigenvalue should be 1 exactly. http://floating-point-gui.de/ hth, Alan Isaac From gokhansever at gmail.com Sun May 16 10:36:59 2010 From: gokhansever at gmail.com (=?UTF-8?Q?G=C3=B6khan_Sever?=) Date: Sun, 16 May 2010 09:36:59 -0500 Subject: [Numpy-discussion] Wrong Eigenvalue (Approximation?) In-Reply-To: <4BEFE153.2020900@american.edu> References: <4BEFE153.2020900@american.edu> Message-ID: Floating point numbers; one of my recent favorite subjects... See this hot Slashdot discussion subject: what every programmer should know about floating-point arithmetic On 5/16/10, Alan G Isaac wrote: > On 5/16/2010 12:03 AM, Gabriel Mihalache wrote: >> The eigenvalue should be 1 exactly. > > http://floating-point-gui.de/ > > hth, > Alan Isaac > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -- G?khan From lasagnadavide at gmail.com Sun May 16 15:14:34 2010 From: lasagnadavide at gmail.com (Davide Lasagna) Date: Sun, 16 May 2010 21:14:34 +0200 Subject: [Numpy-discussion] faster code Message-ID: Hi all, What is the fastest and lowest memory consumption way to compute this? 
y = np.arange(2**24) bases = y[1:] + y[:-1] Actually it is already quite fast, but i'm not sure whether it is occupying some temporary memory is the summation. Any help is appreciated. Cheers Davide -------------- next part -------------- An HTML attachment was scrubbed... URL: From kwgoodman at gmail.com Sun May 16 15:24:56 2010 From: kwgoodman at gmail.com (Keith Goodman) Date: Sun, 16 May 2010 12:24:56 -0700 Subject: [Numpy-discussion] faster code In-Reply-To: References: Message-ID: On Sun, May 16, 2010 at 12:14 PM, Davide Lasagna wrote: > Hi all, > What is the fastest and lowest memory consumption way to compute this? > y = np.arange(2**24) > bases?= y[1:] + y[:-1] > Actually it is already quite fast, but i'm not sure whether it is occupying > some temporary memory > is the summation. Any help is appreciated. Is it OK to modify y? If so: >> y = np.arange(2**24) >> z = y[1:] + y[:-1] # <--- Slow way >> y[:-1] += y[1:] # <--- Fast way >> (y[:-1] == z).all() True From efiring at hawaii.edu Sun May 16 16:18:47 2010 From: efiring at hawaii.edu (Eric Firing) Date: Sun, 16 May 2010 10:18:47 -1000 Subject: [Numpy-discussion] faster code In-Reply-To: References: Message-ID: <4BF05327.2000002@hawaii.edu> On 05/16/2010 09:24 AM, Keith Goodman wrote: > On Sun, May 16, 2010 at 12:14 PM, Davide Lasagna > wrote: >> Hi all, >> What is the fastest and lowest memory consumption way to compute this? >> y = np.arange(2**24) >> bases = y[1:] + y[:-1] >> Actually it is already quite fast, but i'm not sure whether it is occupying >> some temporary memory >> is the summation. Any help is appreciated. > > Is it OK to modify y? If so: > >>> y = np.arange(2**24) >>> z = y[1:] + y[:-1] #<--- Slow way >>> y[:-1] += y[1:] #<--- Fast way >>> (y[:-1] == z).all() > True It's not faster on my machine, as timed with ipython: In [8]:y = np.arange(2**24) In [9]:b = np.array([1,1], dtype=int) In [10]:timeit np.convolve(y, b, 'valid') 1 loops, best of 3: 484 ms per loop In [11]:timeit y[1:] + y[:-1] 10 loops, best of 3: 181 ms per loop In [12]:timeit y[:-1] += y[1:] 10 loops, best of 3: 183 ms per loop If we include the fake data generation in the timing, to reduce cache bias in the repeated runs, the += method is noticeably slower. In [13]:timeit y = np.arange(2**24); z = y[1:] + y[:-1] 1 loops, best of 3: 297 ms per loop In [14]:timeit y = np.arange(2**24); y[:-1] += y[1:]; z = y[:-1] 1 loops, best of 3: 322 ms per loop Eric From gael.varoquaux at normalesup.org Sun May 16 16:37:28 2010 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Sun, 16 May 2010 22:37:28 +0200 Subject: [Numpy-discussion] [sympy] EuroScipy abstract submission deadline extended Message-ID: <20100516203728.GJ19278@phare.normalesup.org> Given that we have been able to turn on registration only very late, the EuroScipy conference committee is extending the deadline for abstract submission for the 2010 EuroScipy conference. On Thursday May 20th, at midnight Samoa time, we will turn off the abstract submission on the conference site. Up to then, you can modify the already-submitted abstract, or submit new abstracts. We are very much looking forward to your submissions to the conference. Ga?l Varoquaux Nicolas Chauvat -- EuroScipy 2010 is the annual European conference for scientists using Python. It will be held July 8-11 2010, in ENS, Paris, France. 
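Coming back to the memory question at the top of this thread: one way to control the temporaries explicitly is to preallocate the output and pass it to the ufunc. A sketch (whether it is actually faster will depend on the machine and cache, as the timings above suggest):

import numpy as np

y = np.arange(2**24)
bases = np.empty(y.size - 1, dtype=y.dtype)
np.add(y[1:], y[:-1], out=bases)   # one pass; only the preallocated output is created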
Links: Conference website: http://www.euroscipy.org/conference/euroscipy2010 Call for papers: http://www.euroscipy.org/card/euroscipy2010_call_for_papers Practical information: http://www.euroscipy.org/card/euroscipy2010_practical_information From bpederse at gmail.com Sun May 16 16:53:47 2010 From: bpederse at gmail.com (Brent Pedersen) Date: Sun, 16 May 2010 13:53:47 -0700 Subject: [Numpy-discussion] faster code In-Reply-To: References: Message-ID: On Sun, May 16, 2010 at 12:14 PM, Davide Lasagna wrote: > Hi all, > What is the fastest and lowest memory consumption way to compute this? > y = np.arange(2**24) > bases?= y[1:] + y[:-1] > Actually it is already quite fast, but i'm not sure whether it is occupying > some temporary memory > is the summation. Any help is appreciated. > Cheers > Davide > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > how about something like this? may have off-by-1 somewhere. >>> bases = np.arange(1, 2*2**24-1, 2) From lasagnadavide at gmail.com Sun May 16 16:57:14 2010 From: lasagnadavide at gmail.com (Davide Lasagna) Date: Sun, 16 May 2010 22:57:14 +0200 Subject: [Numpy-discussion] faster code In-Reply-To: References: Message-ID: Well, actually np.arange(2**24) was just to test the following line ;). I'm particularly concerned about memory consumption rather than speed. On 16 May 2010 22:53, Brent Pedersen wrote: > On Sun, May 16, 2010 at 12:14 PM, Davide Lasagna > wrote: > > Hi all, > > What is the fastest and lowest memory consumption way to compute this? > > y = np.arange(2**24) > > bases = y[1:] + y[:-1] > > Actually it is already quite fast, but i'm not sure whether it is > occupying > > some temporary memory > > is the summation. Any help is appreciated. > > Cheers > > Davide > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > > > how about something like this? may have off-by-1 somewhere. > > >>> bases = np.arange(1, 2*2**24-1, 2) > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From erik.tollerud at gmail.com Sun May 16 18:13:16 2010 From: erik.tollerud at gmail.com (Erik Tollerud) Date: Sun, 16 May 2010 15:13:16 -0700 Subject: [Numpy-discussion] newbie: convert recarray to floating-point ndarray with mixed types In-Reply-To: <4BEB90B0.1080909@hawaii.edu> References: <1D673F86DDA00841A1216F04D1CE70D64183757513@EXCH2.nws.oregonstate.edu> <4BEB90B0.1080909@hawaii.edu> Message-ID: If you want to do it in just one line (the third line below), this seems to work - unless you have zillions of types in the structured array it should be plenty fast, too: >>> import numpy as np >>> A = np.array([(1.0, 2), (3.0, 4)], dtype=[('x', float), ('y', int)]) >>> array([A[n] for n in A.dtype.names],dtype=float).T array([[1., 2.], [3., 4.]]) You may or may not want the transpose depending on which way you meant to have the matrix aligned... On Wed, May 12, 2010 at 10:40 PM, Eric Firing wrote: > On 05/12/2010 12:37 PM, Gregory, Matthew wrote: >> Apologies for what is likely a simple question and I hope it hasn't been asked before ... >> >> Given a recarray with a dtype consisting of more than one type, e.g. >> >> ? 
?>>> ?import numpy as n >> ? ?>>> ?a = n.array([(1.0, 2), (3.0, 4)], dtype=[('x', float), ('y', int)]) >> ? ?>>> ?b = a.view(n.recarray) >> ? ?>>> ?b >> ? ?rec.array([(1.0, 2), (3.0, 4)], >> ? ? ? ? ?dtype=[('x', '> >> Is there a simple way to convert 'b' to a floating-point ndarray, casting the integer field to a floating-point? ?I've tried the na?ve: >> >> ? ?>>> ?c = b.view(dtype='float').reshape(b.size,-1) >> >> but that fails with: >> >> ? ?ValueError: new type not compatible with array. >> >> I understand why this would fail (as it is a view and not a copy), but I'm lost on a method to do this conversion simply. >> > > It may not be as simple as you would like, but the following works > efficiently: > > import numpy as np > a = np.array([(1.0, 2), (3.0, 4)], dtype=[('x', float), ('y', int)]) > b = np.empty((a.shape[0], 2), dtype=np.float) > b[:,0] = a['x'] > b[:,1] = a['y'] > > Eric > > > >> thanks, matt > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -- Erik Tollerud http://ps.uci.edu/~etolleru From kwgoodman at gmail.com Sun May 16 18:29:21 2010 From: kwgoodman at gmail.com (Keith Goodman) Date: Sun, 16 May 2010 15:29:21 -0700 Subject: [Numpy-discussion] faster code In-Reply-To: <4BF05327.2000002@hawaii.edu> References: <4BF05327.2000002@hawaii.edu> Message-ID: On Sun, May 16, 2010 at 1:18 PM, Eric Firing wrote: > On 05/16/2010 09:24 AM, Keith Goodman wrote: >> On Sun, May 16, 2010 at 12:14 PM, Davide Lasagna >> ?wrote: >>> Hi all, >>> What is the fastest and lowest memory consumption way to compute this? >>> y = np.arange(2**24) >>> bases = y[1:] + y[:-1] >>> Actually it is already quite fast, but i'm not sure whether it is occupying >>> some temporary memory >>> is the summation. Any help is appreciated. >> >> Is it OK to modify y? If so: >> >>>> y = np.arange(2**24) >>>> z = y[1:] + y[:-1] ?#<--- Slow way >>>> y[:-1] += y[1:] ?#<--- Fast way >>>> (y[:-1] == z).all() >> ? ? True > > > It's not faster on my machine, as timed with ipython: > > In [8]:y = np.arange(2**24) > > In [9]:b = np.array([1,1], dtype=int) > > In [10]:timeit np.convolve(y, b, 'valid') > 1 loops, best of 3: 484 ms per loop > > In [11]:timeit y[1:] + y[:-1] > 10 loops, best of 3: 181 ms per loop > > In [12]:timeit y[:-1] += y[1:] > 10 loops, best of 3: 183 ms per loop > > If we include the fake data generation in the timing, to reduce cache > bias in the repeated runs, the += method is noticeably slower. > > In [13]:timeit y = np.arange(2**24); z = y[1:] + y[:-1] > 1 loops, best of 3: 297 ms per loop > > In [14]:timeit y = np.arange(2**24); y[:-1] += y[1:]; z = y[:-1] > 1 loops, best of 3: 322 ms per loop That's interesting. On my computer it is faster: >> timeit y = np.arange(2**24); z = y[1:] + y[:-1] 10 loops, best of 3: 144 ms per loop >> timeit y = np.arange(2**24); y[:-1] += y[1:]; z = y[:-1] 10 loops, best of 3: 114 ms per loop What accounts for the performance difference? Cache size? I assume the in-place version uses less memory. Neat if timeit reported memory usage. I haven't tried numexp, that might be something to try too. From gael.varoquaux at normalesup.org Thu May 13 09:31:24 2010 From: gael.varoquaux at normalesup.org (=?ISO-8859-1?Q?Ga=EBl_Varoquaux?=) Date: Thu, 13 May 2010 15:31:24 +0200 Subject: [Numpy-discussion] EuroScipy is finally open for registration Message-ID: The registration for EuroScipyis finally open. 
To register, go to the website, create an account, and you will see a *?register to the conference?* button on the left. Follow it to a page which presents a *?shoping cart?*. Simply submitting this information registers you to the conference, and on the left of the website, the button will now display *?You are registered for the conference?*. The registration fee is 50 euros for the conference, and 50 euros for the tutorial. Right now there is no payment system: you will be contacted later (in a week) with instructions for paying. We apologize for such a late set up. We do realize this has come as an inconvenience to people. *Do not wait to register: the number of people we can host is limited.* An exciting program Tutorials: from beginners to experts We have two tutorial tracks: - *Introductory tutorial* : to get you to speed on scientific programming with Python. - *Advanced tutorial* : experts sharing their knowledge on specific techniques and libraries. We are very fortunate to have a top notch set of presenters. Scientific track: doing new science in Python Although the abstract submission is not yet over, We can say that we are going to have a rich set of talks, looking at the current submissions. In addition to the contributed talks, we have: - *Keynote speakers* : Hans Petter Langtangen and Konrard Hinsen, two major player of scientific computing in Python. - *Lightning talks* : one hour will be open for people to come up and present in a flash an interesting project. Publishing papers We are talking with the editors of a major scientific computing journal, and the odds are quite high that we will be able to publish a special issue on scientific computing in Python based on the proceedings of the conference. The papers will undergo peer-review independently from the conference, to ensure high quality of the final publication. Call for papers Abstract submission is still open, though not for long. We are soliciting contributions on scientific libraries and tools developed with Python and on scientific or engineering achievements using Python. These include applications, teaching, future development directions, and current research. See the call for papers . *We are very much looking forward to passionate discussions about Python in science in Paris* *Nicolas Chauvat and Ga?l Varoquaux* -------------- next part -------------- An HTML attachment was scrubbed... URL: From gael.varoquaux at normalesup.org Sat May 15 18:40:12 2010 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Sun, 16 May 2010 00:40:12 +0200 Subject: [Numpy-discussion] EuroScipy abstract submission deadline extended Message-ID: <20100515224012.GC19412@phare.normalesup.org> Given that we have been able to turn on registration only very late, the EuroScipy conference committee is extending the deadline for abstract submission for the 2010 EuroScipy conference. On Thursday May 20th, at midnight Samoa time, we will turn off the abstract submission on the conference site. Up to then, you can modify the already-submitted abstract, or submit new abstracts. We are very much looking forward to your submissions to the conference. Ga?l Varoquaux Nicolas Chauvat -- EuroScipy 2010 is the annual conference for scientists using Python. It will be held July 8-11 2010, in ENS, Paris, France. 
Links: Conference website: http://www.euroscipy.org/conference/euroscipy2010 Call for papers: http://www.euroscipy.org/card/euroscipy2010_call_for_papers Practical information: http://www.euroscipy.org/card/euroscipy2010_practical_information From boogaloojb at yahoo.fr Mon May 17 07:03:19 2010 From: boogaloojb at yahoo.fr (Jean-Baptiste Rudant) Date: Mon, 17 May 2010 11:03:19 +0000 (GMT) Subject: [Numpy-discussion] Saving an array on disk to free memory - Pickling Message-ID: <360513.64605.qm@web28503.mail.ukl.yahoo.com> Hello, I tried to create an object : - which behave just like a numpy array ; - which can be saved on disk in an efficient way (numpy.save in my example but with pytables in my real program) ; - which can be "unloaded" (if it is saved) to free memory : it can exsit has an empty stuff which knows how to retrieve real values ; it will be loaded only when we need to work with it ; - which unloads itself before being pickled (values are already saved and don't have to be pickled). It can't, at least I think so, inherit from ndarray because sometimes (for example juste after being unpickled and before being used) it is juste an empty shell. I don't think memmap can be helpful (I want to use pytables to save it on disk and I want it to be flexible : if I use it in a temporary way, I just need it in memory and I will never save it on disk). My problems are : - this code is ugly ; - I have to define explicitely all special methods (__add__, __mul__...) of ndarrays because: * __getattr__ don't retrieve them ; * even if it does, I have to define explicitely the type of the return value (if I well understand, if it inherits from ndarray __array_wrap__ do all the stuff). Thank you for the help. Regards. import numpy import numpy class PersistentArray(object): def __init__(self, values): ''' values is a numpy array ''' self.values = values self.filename = None self.is_loaded = True self.is_saved = False def save(self, filename): self.filename = filename numpy.save(self.filename, self.values) self.is_saved = True def load(self): self.values = numpy.load(self.filename) self.is_loaded = True def unload(self): if not self.is_saved: raise Exception, "PersistentArray must be saved before being unloaded" del self.values self.is_loaded = False def __getitem__(self, index): return self.values[index] def __getattr__(self, key): if key == 'values': if not self.is_loaded: self.load() return self.values elif key == '__array_interface__': #I can't remember why I wrote this code, but I think it's necessary to make pickling work properly raise AttributeError, key else: try: #to emulate ndarray inheritance return self.values.__getattribute__(key) except AttributeError: raise AttributeError, key def __setstate__(self, dict): self.__dict__.update(dict) if self.is_loaded and self.is_saved: self.load() def __getstate__(self): if not self.is_saved: raise Exception, "persistent array must be saved before being pickled" odict = self.__dict__.copy() if self.is_saved: if self.is_loaded: odict['is_loaded'] = False del odict['values'] return odict filename = 'persistent_test.npy' a = PersistentArray(numpy.arange(10e6)) a.save(filename) a.sum() a.unload() # a still exists, knows how to retrieve values if needed, but don't use space in memory -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From nadavh at visionsense.com Mon May 17 12:24:12 2010 From: nadavh at visionsense.com (Nadav Horesh) Date: Mon, 17 May 2010 19:24:12 +0300 Subject: [Numpy-discussion] Saving an array on disk to free memory - Pickling References: <360513.64605.qm@web28503.mail.ukl.yahoo.com> Message-ID: <710F2847B0018641891D9A21602763605AD406@ex3.envision.co.il> Is a memory mapped file is a viable solution to your problem? Nadav -----Original Message----- From: numpy-discussion-bounces at scipy.org on behalf of Jean-Baptiste Rudant Sent: Mon 17-May-10 14:03 To: Numpy Discussion Subject: [Numpy-discussion] Saving an array on disk to free memory - Pickling Hello, I tried to create an object : - which behave just like a numpy array ; - which can be saved on disk in an efficient way (numpy.save in my example but with pytables in my real program) ; - which can be "unloaded" (if it is saved) to free memory : it can exsit has an empty stuff which knows how to retrieve real values ; it will be loaded only when we need to work with it ; - which unloads itself before being pickled (values are already saved and don't have to be pickled). It can't, at least I think so, inherit from ndarray because sometimes (for example juste after being unpickled and before being used) it is juste an empty shell. I don't think memmap can be helpful (I want to use pytables to save it on disk and I want it to be flexible : if I use it in a temporary way, I just need it in memory and I will never save it on disk). My problems are : - this code is ugly ; - I have to define explicitely all special methods (__add__, __mul__...) of ndarrays because: * __getattr__ don't retrieve them ; * even if it does, I have to define explicitely the type of the return value (if I well understand, if it inherits from ndarray __array_wrap__ do all the stuff). Thank you for the help. Regards. import numpy import numpy class PersistentArray(object): def __init__(self, values): ''' values is a numpy array ''' self.values = values self.filename = None self.is_loaded = True self.is_saved = False def save(self, filename): self.filename = filename numpy.save(self.filename, self.values) self.is_saved = True def load(self): self.values = numpy.load(self.filename) self.is_loaded = True def unload(self): if not self.is_saved: raise Exception, "PersistentArray must be saved before being unloaded" del self.values self.is_loaded = False def __getitem__(self, index): return self.values[index] def __getattr__(self, key): if key == 'values': if not self.is_loaded: self.load() return self.values elif key == '__array_interface__': #I can't remember why I wrote this code, but I think it's necessary to make pickling work properly raise AttributeError, key else: try: #to emulate ndarray inheritance return self.values.__getattribute__(key) except AttributeError: raise AttributeError, key def __setstate__(self, dict): self.__dict__.update(dict) if self.is_loaded and self.is_saved: self.load() def __getstate__(self): if not self.is_saved: raise Exception, "persistent array must be saved before being pickled" odict = self.__dict__.copy() if self.is_saved: if self.is_loaded: odict['is_loaded'] = False del odict['values'] return odict filename = 'persistent_test.npy' a = PersistentArray(numpy.arange(10e6)) a.save(filename) a.sum() a.unload() # a still exists, knows how to retrieve values if needed, but don't use space in memory -------------- next part -------------- A non-text attachment was scrubbed... 
Name: winmail.dat Type: application/ms-tnef Size: 4245 bytes Desc: not available URL: From jsseabold at gmail.com Mon May 17 13:22:18 2010 From: jsseabold at gmail.com (Skipper Seabold) Date: Mon, 17 May 2010 13:22:18 -0400 Subject: [Numpy-discussion] savetxt not working with python3.1 In-Reply-To: <710F2847B0018641891D9A21602763605AD3FD@ex3.envision.co.il> References: <710F2847B0018641891D9A21602763605AD3FD@ex3.envision.co.il> Message-ID: On Thu, May 13, 2010 at 8:18 AM, Nadav Horesh wrote: > > in module npyio.py lines 794,796 "file" should be replaced by "_file" > What version of Numpy? Can you file a bug ticket with a failing example? I couldn't replicated with r8417 on Python 3. Skipper From faltet at pytables.org Mon May 17 14:06:44 2010 From: faltet at pytables.org (Francesc Alted) Date: Mon, 17 May 2010 20:06:44 +0200 Subject: [Numpy-discussion] faster code In-Reply-To: References: Message-ID: <201005172006.44839.faltet@pytables.org> A Sunday 16 May 2010 21:14:34 Davide Lasagna escrigu?: > Hi all, > > What is the fastest and lowest memory consumption way to compute this? > > y = np.arange(2**24) > bases = y[1:] + y[:-1] > > Actually it is already quite fast, but i'm not sure whether it is occupying > some temporary memory > is the summation. Any help is appreciated. Both y[1:] and y[:-1] are views of the original y array, so you are not wasting temporary space here. So, as I see this, the above idiom is as efficient as it can get in terms of memory usage. -- Francesc Alted From kwgoodman at gmail.com Mon May 17 14:11:28 2010 From: kwgoodman at gmail.com (Keith Goodman) Date: Mon, 17 May 2010 11:11:28 -0700 Subject: [Numpy-discussion] faster code In-Reply-To: <201005172006.44839.faltet@pytables.org> References: <201005172006.44839.faltet@pytables.org> Message-ID: On Mon, May 17, 2010 at 11:06 AM, Francesc Alted wrote: > A Sunday 16 May 2010 21:14:34 Davide Lasagna escrigu?: >> Hi all, >> >> What is the fastest and lowest memory consumption way to compute this? >> >> y = np.arange(2**24) >> bases = y[1:] + y[:-1] >> >> Actually it is already quite fast, but i'm not sure whether it is occupying >> some temporary memory >> is the summation. Any help is appreciated. > > Both y[1:] and y[:-1] are views of the original y array, so you are not > wasting temporary space here. ?So, as I see this, the above idiom is as > efficient as it can get in terms of memory usage. I thought that this y[:-1] += y[1:] uses half the memory of this bases = y[1:] + y[:-1] From faltet at pytables.org Mon May 17 14:13:35 2010 From: faltet at pytables.org (Francesc Alted) Date: Mon, 17 May 2010 20:13:35 +0200 Subject: [Numpy-discussion] Saving an array on disk to free memory - Pickling In-Reply-To: <360513.64605.qm@web28503.mail.ukl.yahoo.com> References: <360513.64605.qm@web28503.mail.ukl.yahoo.com> Message-ID: <201005172013.35754.faltet@pytables.org> A Monday 17 May 2010 13:03:19 Jean-Baptiste Rudant escrigu?: > Hello, > > I tried to create an object : > - which behave just like a numpy array ; > - which can be saved on disk in an efficient way (numpy.save in my example > but with pytables in my real program) ; - which can be "unloaded" (if it > is saved) to free memory : it can exsit has an empty stuff which knows how > to retrieve real values ; it will be loaded only when we need to work with > it ; - which unloads itself before being pickled (values are already saved > and don't have to be pickled). 
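A rough, illustrative sketch of the kind of proxy being described (class name made up, not a drop-in replacement): numpy's __array__ hook covers most function-style usage without inheriting from ndarray, although the arithmetic operators would still need explicit definitions or a call through the numpy functions.

import numpy

class LazyArray(object):
    # Illustrative only: load the saved array the first time it is needed.
    def __init__(self, filename):
        self.filename = filename
        self._values = None

    def load(self):
        if self._values is None:
            self._values = numpy.load(self.filename)
        return self._values

    def unload(self):
        # drop the in-memory copy; it is re-read on the next access
        self._values = None

    def __array__(self, dtype=None):
        # numpy.asarray(proxy), numpy.sum(proxy), numpy.add(proxy, x), ...
        # all go through this hook and see the real data
        values = self.load()
        if dtype is not None:
            return values.astype(dtype)
        return values

    def __getitem__(self, index):
        return self.load()[index]

    def __getattr__(self, name):
        # fall back to the real array for .shape, .mean(), .dtype, ...
        return getattr(self.load(), name)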
[clip] Well, if you are using Linux, you can make use of /dev/shm in order to save your files in-memory instead of disk. There are some considerations to have in mind for doing this though: http://superuser.com/questions/45342/when-should-i-use-dev-shm-and-when- should-i-use-tmp However, I don't know the equivalent to this in Win, Mac OSX or other UNICES. -- Francesc Alted From faltet at pytables.org Mon May 17 14:18:57 2010 From: faltet at pytables.org (Francesc Alted) Date: Mon, 17 May 2010 20:18:57 +0200 Subject: [Numpy-discussion] faster code In-Reply-To: References: <201005172006.44839.faltet@pytables.org> Message-ID: <201005172018.57965.faltet@pytables.org> A Monday 17 May 2010 20:11:28 Keith Goodman escrigu?: > On Mon, May 17, 2010 at 11:06 AM, Francesc Alted wrote: > > A Sunday 16 May 2010 21:14:34 Davide Lasagna escrigu?: > >> Hi all, > >> > >> What is the fastest and lowest memory consumption way to compute this? > >> > >> y = np.arange(2**24) > >> bases = y[1:] + y[:-1] > >> > >> Actually it is already quite fast, but i'm not sure whether it is > >> occupying some temporary memory > >> is the summation. Any help is appreciated. > > > > Both y[1:] and y[:-1] are views of the original y array, so you are not > > wasting temporary space here. So, as I see this, the above idiom is as > > efficient as it can get in terms of memory usage. > > I thought that this > > y[:-1] += y[1:] > > uses half the memory of this > > bases = y[1:] + y[:-1] Indeed. But that way you are altering the contents of the original y array and I'm not sure if this is what the OP wanted. -- Francesc Alted From pav at iki.fi Mon May 17 15:12:59 2010 From: pav at iki.fi (Pauli Virtanen) Date: Mon, 17 May 2010 19:12:59 +0000 (UTC) Subject: [Numpy-discussion] savetxt not working with python3.1 References: <710F2847B0018641891D9A21602763605AD3FD@ex3.envision.co.il> Message-ID: Mon, 17 May 2010 13:22:18 -0400, Skipper Seabold wrote: [clip] > What version of Numpy? Can you file a bug ticket with a failing > example? I couldn't replicated with r8417 on Python 3. It was fixed in r8411. -- Pauli Virtanen From seb.haase at gmail.com Mon May 17 15:58:54 2010 From: seb.haase at gmail.com (Sebastian Haase) Date: Mon, 17 May 2010 21:58:54 +0200 Subject: [Numpy-discussion] Wrong Eigenvalue (Approximation?) In-Reply-To: <4BEFE153.2020900@american.edu> References: <4BEFE153.2020900@american.edu> Message-ID: On Sun, May 16, 2010 at 2:13 PM, Alan G Isaac wrote: > On 5/16/2010 12:03 AM, Gabriel Mihalache wrote: >> The eigenvalue should be 1 exactly. > > http://floating-point-gui.de/ > Hi, just wondering why that site you just referred doesn't say who the author is !? It looks very nice (i.e. pleasing to the eye...) - Sebastian Haase From boogaloojb at yahoo.fr Tue May 18 02:57:47 2010 From: boogaloojb at yahoo.fr (Jean-Baptiste Rudant) Date: Tue, 18 May 2010 06:57:47 +0000 (GMT) Subject: [Numpy-discussion] Re : Saving an array on disk to free memory - Pickling In-Reply-To: <201005172013.35754.faltet@pytables.org> References: <360513.64605.qm@web28503.mail.ukl.yahoo.com> <201005172013.35754.faltet@pytables.org> Message-ID: <830138.25838.qm@web28503.mail.ukl.yahoo.com> Thank you very much for the help. But I was more looking for some coding solution (furthermore, I'm not using Linux). My point in not to make some real arrays looking like they are saved on files (and use for it some files in memory), but at the contrary, to make some "fake" arrays, saved on disk, to look like real arrays. 
In other words, to make them behave like if they were inherited from ndarrays. In pytables, if you use my_node[:] it returns an array. It's part of what I want to do. Plus : - if I call my_node[:] ten times, only the first call will read to disk (it matters, even if pytables is very very fast) ; - the same class can represent a node or a numpy array. Jean-Baptiste Rudant ________________________________ De : Francesc Alted ? : Discussion of Numerical Python Envoy? le : Lun 17 mai 2010, 20h 13min 35s Objet : Re: [Numpy-discussion] Saving an array on disk to free memory - Pickling A Monday 17 May 2010 13:03:19 Jean-Baptiste Rudant escrigu?: > Hello, > > I tried to create an object : > - which behave just like a numpy array ; > - which can be saved on disk in an efficient way (numpy.save in my example > but with pytables in my real program) ; - which can be "unloaded" (if it > is saved) to free memory : it can exsit has an empty stuff which knows how > to retrieve real values ; it will be loaded only when we need to work with > it ; - which unloads itself before being pickled (values are already saved > and don't have to be pickled). [clip] Well, if you are using Linux, you can make use of /dev/shm in order to save your files in-memory instead of disk. There are some considerations to have in mind for doing this though: http://superuser.com/questions/45342/when-should-i-use-dev-shm-and-when- should-i-use-tmp However, I don't know the equivalent to this in Win, Mac OSX or other UNICES. -- Francesc Alted _______________________________________________ NumPy-Discussion mailing list NumPy-Discussion at scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion -------------- next part -------------- An HTML attachment was scrubbed... URL: From wright at esrf.fr Tue May 18 03:33:32 2010 From: wright at esrf.fr (Jon Wright) Date: Tue, 18 May 2010 09:33:32 +0200 Subject: [Numpy-discussion] Wrong Eigenvalue (Approximation?) In-Reply-To: References: <4BEFE153.2020900@american.edu> Message-ID: <4BF242CC.4010800@esrf.fr> Sebastian Haase wrote: > On Sun, May 16, 2010 at 2:13 PM, Alan G Isaac wrote: >> On 5/16/2010 12:03 AM, Gabriel Mihalache wrote: >>> The eigenvalue should be 1 exactly. >> http://floating-point-gui.de/ >> > Hi, just wondering why that site you just referred doesn't say who the > author is !? > It looks very nice (i.e. pleasing to the eye...) Very nice indeed! Top right link "fork me at github" leads you to the author, Michael Borgwardt. From faltet at pytables.org Tue May 18 03:42:03 2010 From: faltet at pytables.org (Francesc Alted) Date: Tue, 18 May 2010 09:42:03 +0200 Subject: [Numpy-discussion] Re : Saving an array on disk to free memory - Pickling In-Reply-To: <830138.25838.qm@web28503.mail.ukl.yahoo.com> References: <360513.64605.qm@web28503.mail.ukl.yahoo.com> <201005172013.35754.faltet@pytables.org> <830138.25838.qm@web28503.mail.ukl.yahoo.com> Message-ID: <201005180942.03538.faltet@pytables.org> A Tuesday 18 May 2010 08:57:47 Jean-Baptiste Rudant escrigu?: > Thank you very much for the help. > > But I was more looking for some coding solution (furthermore, I'm not using > Linux). My point in not to make some real arrays looking like they are > saved on files (and use for it some files in memory), but at the contrary, > to make some "fake" arrays, saved on disk, to look like real arrays. In > other words, to make them behave like if they were inherited from > ndarrays. > > In pytables, if you use my_node[:] it returns an array. It's part of what I > want to do. 
Plus : > - if I call my_node[:] ten times, only the first call will read to disk > (it matters, even if pytables is very very fast) ; Well, the second time that you read from disk, pytables will also will read from memory instead. Indeed, it is the OS filesystem cache who will do the trick, and the speed may be not exactly the same as a pure numpy array, but hey, it should be very close and with zero implementation cost. Moreover, if you have to do arithmetic computations with your arrays, you can make use of `tables.Expr()` module that can perform them generally faster than using pure numpy (for example, see http://pytables.org/moin/ComputingKernel). -- Francesc Alted From lutz.maibaum at gmail.com Tue May 18 13:43:15 2010 From: lutz.maibaum at gmail.com (Lutz Maibaum) Date: Tue, 18 May 2010 10:43:15 -0700 Subject: [Numpy-discussion] Converting None to NULL using ndpointer Message-ID: Hello, I am trying to use a C library function from Python using numpy and ctypes. Assume the function has the signature void foo(int* bar) so it usually takes an integer array as a parameter. In python, I define foo.argtypes=[ndpointer(dtype="intc",flags="C_CONTIGUOUS")] and can then call the function foo(x), where x is an appropriate ndarray. This works very well. The problem is that I need to be able to call the function foo with a NULL pointer as an argument. Ideally, I would like to be able to use foo(None), but that doesn't work because the argument conversion throws a TypeError. Therefore my questions: 1. Is there a ndarray that is converted to NULL by ndpointer? I was hoping that maybe ndarray(None) or an array of size 0 would work, but they don't. 2. Is there a way to tell ndpointer that None is an acceptable parameter, which should be converted to NULL? 3. I could also change the argument specification in python to foo.argtypes=[POINTER(c_int)], but then I would lose the type checking features of ndpointer. Can I invoke the ndpointer conversion manually? For example, does a ndpointer object have a method that takes a ndarray as an argument and returns a POINTER(c_int) if appropriate? Any suggestions would be much appreciated. Thanks, Lutz From wccarithers at lbl.gov Wed May 19 14:03:24 2010 From: wccarithers at lbl.gov (William Carithers) Date: Wed, 19 May 2010 11:03:24 -0700 Subject: [Numpy-discussion] Runtime error in numpy.polyfit Message-ID: I'm trying to do a simple 2nd degree polynomial fit to two arrays of 5 entries. I get a runtime error: RuntimeError: more argument specifiers than keyword list entries (remaining format:'|:calc_lwork.gelss') in the lstsq module inside numpy.polyfit. 
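On the ctypes/ndpointer question above, a sketch of one common workaround (the helper name is made up): subclass the type returned by ndpointer() and let its from_param pass None straight through, which ctypes then hands to the C function as a NULL pointer. The from_param classmethod is also the answer to the third question, since it is the conversion and checking step that can be invoked by hand.

import numpy as np
from numpy.ctypeslib import ndpointer

def ndpointer_or_null(*args, **kwargs):
    # like ndpointer(), but None is accepted and becomes a NULL pointer
    base = ndpointer(*args, **kwargs)

    def from_param(cls, obj):
        if obj is None:
            return obj                 # ctypes passes None as NULL
        return base.from_param(obj)    # normal dtype/flags checking

    return type(base.__name__, (base,), {'from_param': classmethod(from_param)})

# usage sketch:
# foo.argtypes = [ndpointer_or_null(dtype='intc', flags='C_CONTIGUOUS')]
# foo(x)       # x is checked exactly as before
# foo(None)    # passes NULL to the C side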
Here's the code snippet: def findPeak(self, ydex, xdex): # take a vertical slice vslice = [] for i in range(-1,10,1) : vslice.append(arcImage[ydex+i][xdex]) vdex = n.where(vslice == max(vslice)) ymax = ydex -1 + vdex[0][0] # approximate gaussian fit by parabolic fit to logs yvalues = n.array([ymax-2, ymax-1, ymax, ymax+1, ymax+2]) svalues=n.array([arcImage[ymax-2][xdex],arcImage[ymax-1][xdex],arcImage[ymax ][xdex],arcImage[ymax+1][xdex], arcImage[ymax+2][xdex]]) avalues = n.log(svalues) ypoly = n.polyfit(yvalues, avalues, 2) And the traceback: File "/Users/williamcarithers/BOSS/src/calibrationModel.py", line 345, in findPeak ypoly = n.polyfit(yvalues, avalues, 2) File "/System/Library/Frameworks/Python.framework/Versions/2.6/Extras/lib/python/ numpy/lib/polynomial.py", line 503, in polyfit c, resids, rank, s = _lstsq(v, y, rcond) File "/System/Library/Frameworks/Python.framework/Versions/2.6/Extras/lib/python/ numpy/lib/polynomial.py", line 46, in _lstsq return lstsq(X, y, rcond) File "/Library/Python/2.6/site-packages/scipy-0.7.1-py2.6-macosx-10.6-universal.e gg/scipy/linalg/basic.py", line 545, in lstsq lwork = calc_lwork.gelss(gelss.prefix,m,n,nrhs)[1] RuntimeError: more argument specifiers than keyword list entries (remaining format:'|:calc_lwork.gelss') This is such a simple application of polyfit and the error occurs in the guts of lstsq, so I'm completely stumped. Any help would be greatly appreciated. Thanks, Bill Carithers From d.l.goldsmith at gmail.com Wed May 19 15:18:44 2010 From: d.l.goldsmith at gmail.com (David Goldsmith) Date: Wed, 19 May 2010 12:18:44 -0700 Subject: [Numpy-discussion] Runtime error in numpy.polyfit In-Reply-To: References: Message-ID: Charles H.: is this happening because he's calling the old version of polyfit? William: try using numpy.polynomial.polyfit instead, see if that works. DG On Wed, May 19, 2010 at 11:03 AM, William Carithers wrote: > I'm trying to do a simple 2nd degree polynomial fit to two arrays of 5 > entries. I get a runtime error: > RuntimeError: more argument specifiers than keyword list entries (remaining > format:'|:calc_lwork.gelss') in the lstsq module inside numpy.polyfit. 
> > Here's the code snippet: > def findPeak(self, ydex, xdex): > # take a vertical slice > vslice = [] > for i in range(-1,10,1) : > vslice.append(arcImage[ydex+i][xdex]) > vdex = n.where(vslice == max(vslice)) > ymax = ydex -1 + vdex[0][0] > # approximate gaussian fit by parabolic fit to logs > yvalues = n.array([ymax-2, ymax-1, ymax, ymax+1, ymax+2]) > > > svalues=n.array([arcImage[ymax-2][xdex],arcImage[ymax-1][xdex],arcImage[ymax > ][xdex],arcImage[ymax+1][xdex], arcImage[ymax+2][xdex]]) > avalues = n.log(svalues) > ypoly = n.polyfit(yvalues, avalues, 2) > > And the traceback: > File "/Users/williamcarithers/BOSS/src/calibrationModel.py", line 345, in > findPeak > ypoly = n.polyfit(yvalues, avalues, 2) > File > > "/System/Library/Frameworks/Python.framework/Versions/2.6/Extras/lib/python/ > numpy/lib/polynomial.py", line 503, in polyfit > c, resids, rank, s = _lstsq(v, y, rcond) > File > > "/System/Library/Frameworks/Python.framework/Versions/2.6/Extras/lib/python/ > numpy/lib/polynomial.py", line 46, in _lstsq > return lstsq(X, y, rcond) > File > > "/Library/Python/2.6/site-packages/scipy-0.7.1-py2.6-macosx-10.6-universal.e > gg/scipy/linalg/basic.py", line 545, in lstsq > lwork = calc_lwork.gelss(gelss.prefix,m,n,nrhs)[1] > RuntimeError: more argument specifiers than keyword list entries (remaining > format:'|:calc_lwork.gelss') > > This is such a simple application of polyfit and the error occurs in the > guts of lstsq, so I'm completely stumped. Any help would be greatly > appreciated. > > Thanks, > Bill Carithers > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -- Mathematician: noun, someone who disavows certainty when their uncertainty set is non-empty, even if that set has measure zero. Hope: noun, that delusive spirit which escaped Pandora's jar and, with her lies, prevents mankind from committing a general suicide. (As interpreted by Robert Graves) -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Wed May 19 15:24:24 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Wed, 19 May 2010 15:24:24 -0400 Subject: [Numpy-discussion] Runtime error in numpy.polyfit In-Reply-To: References: Message-ID: On Wed, May 19, 2010 at 3:18 PM, David Goldsmith wrote: > Charles H.: is this happening because he's calling the old version of > polyfit? > > William: try using numpy.polynomial.polyfit instead, see if that works. > > DG > > On Wed, May 19, 2010 at 11:03 AM, William Carithers > wrote: >> >> I'm trying to do a simple 2nd degree polynomial fit to two arrays of 5 >> entries. I get a runtime error: >> RuntimeError: more argument specifiers than keyword list entries >> (remaining >> format:'|:calc_lwork.gelss') ?in the lstsq module inside numpy.polyfit. >> >> Here's the code snippet: >> def findPeak(self, ydex, xdex): >> ? ? ? ?# take a vertical slice >> ? ? ? ?vslice = [] >> ? ? ? ?for i in range(-1,10,1) : >> ? ? ? ? ? ?vslice.append(arcImage[ydex+i][xdex]) >> ? ? ? ?vdex = n.where(vslice == max(vslice)) >> ? ? ? ?ymax = ydex -1 + vdex[0][0] >> ? ? ? ?# approximate gaussian fit by parabolic fit to logs >> ? ? ? ?yvalues = n.array([ymax-2, ymax-1, ymax, ymax+1, ymax+2]) >> >> >> svalues=n.array([arcImage[ymax-2][xdex],arcImage[ymax-1][xdex],arcImage[ymax >> ][xdex],arcImage[ymax+1][xdex], arcImage[ymax+2][xdex]]) >> ? ? ? ?avalues = n.log(svalues) >> ? ? ? 
?ypoly = n.polyfit(yvalues, avalues, 2) >> >> And the traceback: >> File "/Users/williamcarithers/BOSS/src/calibrationModel.py", line 345, in >> findPeak >> ? ?ypoly = n.polyfit(yvalues, avalues, 2) >> ?File >> >> "/System/Library/Frameworks/Python.framework/Versions/2.6/Extras/lib/python/ >> numpy/lib/polynomial.py", line 503, in polyfit >> ? ?c, resids, rank, s = _lstsq(v, y, rcond) >> ?File >> >> "/System/Library/Frameworks/Python.framework/Versions/2.6/Extras/lib/python/ >> numpy/lib/polynomial.py", line 46, in _lstsq >> ? ?return lstsq(X, y, rcond) >> ?File >> >> "/Library/Python/2.6/site-packages/scipy-0.7.1-py2.6-macosx-10.6-universal.e >> gg/scipy/linalg/basic.py", line 545, in lstsq >> ? ?lwork = calc_lwork.gelss(gelss.prefix,m,n,nrhs)[1] >> RuntimeError: more argument specifiers than keyword list entries >> (remaining >> format:'|:calc_lwork.gelss') >> >> This is such a simple application of polyfit and the error occurs in the >> guts of lstsq, so I'm completely stumped. Any help would be greatly >> appreciated. which version of numpy and the arguments to polyfit would be useful information,e.g. print repr(yvalues) print repr(avalues) before the call to polyfit Josef >> >> Thanks, >> Bill Carithers >> >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > -- > Mathematician: noun, someone who disavows certainty when their uncertainty > set is non-empty, even if that set has measure zero. > > Hope: noun, that delusive spirit which escaped Pandora's jar and, with her > lies, prevents mankind from committing a general suicide. ?(As interpreted > by Robert Graves) > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > From wccarithers at lbl.gov Wed May 19 15:51:36 2010 From: wccarithers at lbl.gov (William Carithers) Date: Wed, 19 May 2010 12:51:36 -0700 Subject: [Numpy-discussion] Runtime error in numpy.polyfit In-Reply-To: Message-ID: Thanks David and Josef. Replies interspersed below. On 5/19/10 12:24 PM, "josef.pktd at gmail.com" wrote: > On Wed, May 19, 2010 at 3:18 PM, David Goldsmith > wrote: >> Charles H.: is this happening because he's calling the old version of >> polyfit? >> >> William: try using numpy.polynomial.polyfit instead, see if that works. It says ypoly = n.polynomial.polyfit(yvalues, avalues, 2) AttributeError: 'module' object has no attribute 'polynomial' Is this because I'm using a relatively old (numpy-1.2.1) version? >> >> DG >> >> On Wed, May 19, 2010 at 11:03 AM, William Carithers >> wrote: >>> >>> I'm trying to do a simple 2nd degree polynomial fit to two arrays of 5 >>> entries. I get a runtime error: >>> RuntimeError: more argument specifiers than keyword list entries >>> (remaining >>> format:'|:calc_lwork.gelss') ?in the lstsq module inside numpy.polyfit. >>> >>> Here's the code snippet: >>> def findPeak(self, ydex, xdex): >>> ? ? ? ?# take a vertical slice >>> ? ? ? ?vslice = [] >>> ? ? ? ?for i in range(-1,10,1) : >>> ? ? ? ? ? ?vslice.append(arcImage[ydex+i][xdex]) >>> ? ? ? ?vdex = n.where(vslice == max(vslice)) >>> ? ? ? ?ymax = ydex -1 + vdex[0][0] >>> ? ? ? ?# approximate gaussian fit by parabolic fit to logs >>> ? ? ? 
?yvalues = n.array([ymax-2, ymax-1, ymax, ymax+1, ymax+2]) >>> >>> >>> svalues=n.array([arcImage[ymax-2][xdex],arcImage[ymax-1][xdex],arcImage[ymax >>> ][xdex],arcImage[ymax+1][xdex], arcImage[ymax+2][xdex]]) >>> ? ? ? ?avalues = n.log(svalues) >>> ? ? ? ?ypoly = n.polyfit(yvalues, avalues, 2) >>> >>> And the traceback: >>> File "/Users/williamcarithers/BOSS/src/calibrationModel.py", line 345, in >>> findPeak >>> ? ?ypoly = n.polyfit(yvalues, avalues, 2) >>> ?File >>> >>> "/System/Library/Frameworks/Python.framework/Versions/2.6/Extras/lib/python/ >>> numpy/lib/polynomial.py", line 503, in polyfit >>> ? ?c, resids, rank, s = _lstsq(v, y, rcond) >>> ?File >>> >>> "/System/Library/Frameworks/Python.framework/Versions/2.6/Extras/lib/python/ >>> numpy/lib/polynomial.py", line 46, in _lstsq >>> ? ?return lstsq(X, y, rcond) >>> ?File >>> >>> "/Library/Python/2.6/site-packages/scipy-0.7.1-py2.6-macosx-10.6-universal.e >>> gg/scipy/linalg/basic.py", line 545, in lstsq >>> ? ?lwork = calc_lwork.gelss(gelss.prefix,m,n,nrhs)[1] >>> RuntimeError: more argument specifiers than keyword list entries >>> (remaining >>> format:'|:calc_lwork.gelss') >>> >>> This is such a simple application of polyfit and the error occurs in the >>> guts of lstsq, so I'm completely stumped. Any help would be greatly >>> appreciated. > > which version of numpy and the arguments to polyfit would be useful > information,e.g. > > print repr(yvalues) > print repr(avalues) > > before the call to polyfit Hi Josef, I'm using numpy-1.2.1 Here are the arrays array([ 864., 865., 866., 867., 868.]) array([ 5.24860191, 6.0217514 , 6.11434555, 6.09198856, 5.73753977], dtype=float32) thanks > > Josef > > >>> >>> Thanks, >>> Bill Carithers >>> >>> >>> _______________________________________________ >>> NumPy-Discussion mailing list >>> NumPy-Discussion at scipy.org >>> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> >> >> -- >> Mathematician: noun, someone who disavows certainty when their uncertainty >> set is non-empty, even if that set has measure zero. >> >> Hope: noun, that delusive spirit which escaped Pandora's jar and, with her >> lies, prevents mankind from committing a general suicide. ?(As interpreted >> by Robert Graves) >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From dsdale24 at gmail.com Wed May 19 16:08:57 2010 From: dsdale24 at gmail.com (Darren Dale) Date: Wed, 19 May 2010 16:08:57 -0400 Subject: [Numpy-discussion] question about creating numpy arrays Message-ID: I have a question about creation of numpy arrays from a list of objects, which bears on the Quantities project and also on masked arrays: >>> import quantities as pq >>> import numpy as np >>> a, b = 2*pq.m,1*pq.s >>> np.array([a, b]) array([ 12., 1.]) Why doesn't that create an object array? Similarly: >>> m = np.ma.array([1], mask=[True]) >>> m masked_array(data = [--], mask = [ True], fill_value = 999999) >>> np.array([m]) array([[1]]) This has broader implications than just creating arrays, for example: >>> np.sum([m, m]) 2 >>> np.sum([a, b]) 13.0 Any thoughts? 
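For what it is worth, a small sketch of forcing the object dtype explicitly, which keeps each element intact instead of letting np.array() coerce everything to plain floats (illustrative only):

import numpy as np

m = np.ma.array([1], mask=[True])

obj = np.empty(2, dtype=object)    # build the object array explicitly
obj[0] = m
obj[1] = m

# reductions on an object array dispatch to the elements' own __add__,
# so the mask (or units, for quantities) is not silently dropped
total = obj.sum()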
Thanks, Darren From josef.pktd at gmail.com Wed May 19 16:09:11 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Wed, 19 May 2010 16:09:11 -0400 Subject: [Numpy-discussion] Runtime error in numpy.polyfit In-Reply-To: References: Message-ID: On Wed, May 19, 2010 at 3:51 PM, William Carithers wrote: > Thanks David and Josef. Replies interspersed below. > > > On 5/19/10 12:24 PM, "josef.pktd at gmail.com" wrote: > >> On Wed, May 19, 2010 at 3:18 PM, David Goldsmith >> wrote: >>> Charles H.: is this happening because he's calling the old version of >>> polyfit? >>> >>> William: try using numpy.polynomial.polyfit instead, see if that works. > > It says ?ypoly = n.polynomial.polyfit(yvalues, avalues, 2) > AttributeError: 'module' object has no attribute 'polynomial' > > Is this because I'm using a relatively old (numpy-1.2.1) version? >>> >>> DG >>> >>> On Wed, May 19, 2010 at 11:03 AM, William Carithers >>> wrote: >>>> >>>> I'm trying to do a simple 2nd degree polynomial fit to two arrays of 5 >>>> entries. I get a runtime error: >>>> RuntimeError: more argument specifiers than keyword list entries >>>> (remaining >>>> format:'|:calc_lwork.gelss') ?in the lstsq module inside numpy.polyfit. >>>> >>>> Here's the code snippet: >>>> def findPeak(self, ydex, xdex): >>>> ? ? ? ?# take a vertical slice >>>> ? ? ? ?vslice = [] >>>> ? ? ? ?for i in range(-1,10,1) : >>>> ? ? ? ? ? ?vslice.append(arcImage[ydex+i][xdex]) >>>> ? ? ? ?vdex = n.where(vslice == max(vslice)) >>>> ? ? ? ?ymax = ydex -1 + vdex[0][0] >>>> ? ? ? ?# approximate gaussian fit by parabolic fit to logs >>>> ? ? ? ?yvalues = n.array([ymax-2, ymax-1, ymax, ymax+1, ymax+2]) >>>> >>>> >>>> svalues=n.array([arcImage[ymax-2][xdex],arcImage[ymax-1][xdex],arcImage[ymax >>>> ][xdex],arcImage[ymax+1][xdex], arcImage[ymax+2][xdex]]) >>>> ? ? ? ?avalues = n.log(svalues) >>>> ? ? ? ?ypoly = n.polyfit(yvalues, avalues, 2) >>>> >>>> And the traceback: >>>> File "/Users/williamcarithers/BOSS/src/calibrationModel.py", line 345, in >>>> findPeak >>>> ? ?ypoly = n.polyfit(yvalues, avalues, 2) >>>> ?File >>>> >>>> "/System/Library/Frameworks/Python.framework/Versions/2.6/Extras/lib/python/ >>>> numpy/lib/polynomial.py", line 503, in polyfit >>>> ? ?c, resids, rank, s = _lstsq(v, y, rcond) >>>> ?File >>>> >>>> "/System/Library/Frameworks/Python.framework/Versions/2.6/Extras/lib/python/ >>>> numpy/lib/polynomial.py", line 46, in _lstsq >>>> ? ?return lstsq(X, y, rcond) >>>> ?File >>>> >>>> "/Library/Python/2.6/site-packages/scipy-0.7.1-py2.6-macosx-10.6-universal.e >>>> gg/scipy/linalg/basic.py", line 545, in lstsq >>>> ? ?lwork = calc_lwork.gelss(gelss.prefix,m,n,nrhs)[1] >>>> RuntimeError: more argument specifiers than keyword list entries >>>> (remaining >>>> format:'|:calc_lwork.gelss') >>>> >>>> This is such a simple application of polyfit and the error occurs in the >>>> guts of lstsq, so I'm completely stumped. Any help would be greatly >>>> appreciated. >> >> which version of numpy and the arguments to polyfit would be useful >> information,e.g. >> >> print repr(yvalues) >> print repr(avalues) >> >> before the call to polyfit > > Hi Josef, > > I'm using numpy-1.2.1 I don't remember whether 1.2.1 was fully python 2.6 compatible. I would recommend upgrading if possible. 
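As a stopgap on the old install, explicitly casting both inputs to float64 before the fit might be worth a try (untested against 1.2.1, and it assumes the "import numpy as n" alias from the original snippet):

yvalues = n.asarray(yvalues, dtype=n.float64)
avalues = n.asarray(avalues, dtype=n.float64)
ypoly = n.polyfit(yvalues, avalues, 2)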
I don't have any problems with more recent versions of scipy and numpy >>> import numpy as np >>> y = np.array([ 864., 865., 866., 867., 868.]) >>> x = np.array([ 5.24860191, 6.0217514 , 6.11434555, 6.09198856, 5.73753977],dtype=np.float32) >>> np.polyfit(y, x, 2) array([ -1.69296265e-01, 2.93325942e+02, -1.27049335e+05]) I didn't know numpy will use the scipy version of linalg for this. Do the scipy.test() pass? My guess would be that there are some incompatibilities with your python/numpy/scipy versions. Josef > > Here are the arrays > array([ 864., ?865., ?866., ?867., ?868.]) > array([ 5.24860191, ?6.0217514 , ?6.11434555, ?6.09198856, ?5.73753977], > dtype=float32) > > thanks >> >> Josef >> >> >>>> >>>> Thanks, >>>> Bill Carithers >>>> >>>> >>>> _______________________________________________ >>>> NumPy-Discussion mailing list >>>> NumPy-Discussion at scipy.org >>>> http://mail.scipy.org/mailman/listinfo/numpy-discussion >>> >>> >>> >>> -- >>> Mathematician: noun, someone who disavows certainty when their uncertainty >>> set is non-empty, even if that set has measure zero. >>> >>> Hope: noun, that delusive spirit which escaped Pandora's jar and, with her >>> lies, prevents mankind from committing a general suicide. ?(As interpreted >>> by Robert Graves) >>> >>> _______________________________________________ >>> NumPy-Discussion mailing list >>> NumPy-Discussion at scipy.org >>> http://mail.scipy.org/mailman/listinfo/numpy-discussion >>> >>> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From wesmckinn at gmail.com Wed May 19 16:18:07 2010 From: wesmckinn at gmail.com (Wes McKinney) Date: Wed, 19 May 2010 16:18:07 -0400 Subject: [Numpy-discussion] [Job] research developer job opening at AQR Capital Message-ID: We are looking for a scientific Python developer to work on the design and implementation of AQR?s proprietary research and production systems. In this role you will work closely with researchers and portfolio managers to identify problems and create solutions addressing speed, scale, and usability. Most of our work goes into building back-testing infrastructure and other tools for quantitative portfolio analysis. As part of this you will become intimately familiar with financial concepts and AQR?s investment philosophy. Requirements ------------ * Degree in computer science, mathematics, or other technical discipline * Experience level: recent graduates up to 5 years * Experience using NumPy and/or SciPy libraries * Effective problem solving and quantitative skills * Well organized, detail oriented and able to focus in a dynamic environment * Ability to discuss and explain involved concepts in both verbal and written form * Able to work independently, as well as part of a team About AQR Capital Management ---------------------------- The founders of AQR were still Ph.D. candidates when they initially developed the complex financial models that are now at AQR?s core. Today, AQR Capital Management is a $24 billion asset management firm employing a disciplined multi-asset, global research process (AQR stands for Applied Quantitative Research). 
AQR's investment products are provided through a limited set of collective investment vehicles and separate accounts that deploy all or a subset of AQR's investment strategies. These investment products span from aggressive high volatility market-neutral hedge funds, to low volatility benchmark-driven traditional products. AQR is located in Greenwich, CT, a short commute from New York City. * If you are interested in applying for this position please submit your resume to matt.gombos at aqr.com and put "GAARD Candidate" in the subject line. * We are an affirmative action / equal opportunity employer. From josef.pktd at gmail.com Wed May 19 16:19:44 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Wed, 19 May 2010 16:19:44 -0400 Subject: [Numpy-discussion] question about creating numpy arrays In-Reply-To: References: Message-ID: On Wed, May 19, 2010 at 4:08 PM, Darren Dale wrote: > I have a question about creation of numpy arrays from a list of > objects, which bears on the Quantities project and also on masked > arrays: > >>>> import quantities as pq >>>> import numpy as np >>>> a, b = 2*pq.m,1*pq.s >>>> np.array([a, b]) > array([ 12., ? 1.]) > > Why doesn't that create an object array? Similarly: > >>>> m = np.ma.array([1], mask=[True]) >>>> m > masked_array(data = [--], > ? ? ? ? ? ? mask = [ True], > ? ? ? fill_value = 999999) > >>>> np.array([m]) > array([[1]]) > > This has broader implications than just creating arrays, for example: > >>>> np.sum([m, m]) > 2 >>>> np.sum([a, b]) > 13.0 > > Any thoughts? These are "array_like" of floats, so why should it create anything else than an array of floats. It's the most common usecase "array_like" is the most popular type for parameters in the docstrings Josef > > Thanks, > Darren > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From wccarithers at lbl.gov Wed May 19 16:24:07 2010 From: wccarithers at lbl.gov (William Carithers) Date: Wed, 19 May 2010 13:24:07 -0700 Subject: [Numpy-discussion] Runtime error in numpy.polyfit In-Reply-To: Message-ID: Hi Josef, I did the same test, namely opening a new window and plugging in the printout values by hand and polyfit worked just fine. Here's the terminal output: >>> import numpy as n >>> y = n.array([ 864., 865., 866., 867., 868.]) >>> a = n.array([ 5.24860191, 6.0217514 , 6.11434555, 6.09198856, 5.73753977]) >>> ypoly = n.polyfit(y,a,2) >>> ypoly array([ -1.69296264e-01, 2.93325941e+02, -1.27049334e+05]) I wonder if the step of printing plus cut and paste is doing some kind of implicit type conversion. Maybe the original problem has to do with data types? In the original code arcImage is integer data so the avalues array is constructed from avalues = n.log(n.array([...list of integers...])) Should I be doing some kind of casting first? Thanks, Bill On 5/19/10 1:09 PM, "josef.pktd at gmail.com" wrote: > On Wed, May 19, 2010 at 3:51 PM, William Carithers > wrote: >> Thanks David and Josef. Replies interspersed below. >> >> >> On 5/19/10 12:24 PM, "josef.pktd at gmail.com" wrote: >> >>> On Wed, May 19, 2010 at 3:18 PM, David Goldsmith >>> wrote: >>>> Charles H.: is this happening because he's calling the old version of >>>> polyfit? >>>> >>>> William: try using numpy.polynomial.polyfit instead, see if that works. 
>> >> It says ?ypoly = n.polynomial.polyfit(yvalues, avalues, 2) >> AttributeError: 'module' object has no attribute 'polynomial' >> >> Is this because I'm using a relatively old (numpy-1.2.1) version? >>>> >>>> DG >>>> >>>> On Wed, May 19, 2010 at 11:03 AM, William Carithers >>>> wrote: >>>>> >>>>> I'm trying to do a simple 2nd degree polynomial fit to two arrays of 5 >>>>> entries. I get a runtime error: >>>>> RuntimeError: more argument specifiers than keyword list entries >>>>> (remaining >>>>> format:'|:calc_lwork.gelss') ?in the lstsq module inside numpy.polyfit. >>>>> >>>>> Here's the code snippet: >>>>> def findPeak(self, ydex, xdex): >>>>> ? ? ? ?# take a vertical slice >>>>> ? ? ? ?vslice = [] >>>>> ? ? ? ?for i in range(-1,10,1) : >>>>> ? ? ? ? ? ?vslice.append(arcImage[ydex+i][xdex]) >>>>> ? ? ? ?vdex = n.where(vslice == max(vslice)) >>>>> ? ? ? ?ymax = ydex -1 + vdex[0][0] >>>>> ? ? ? ?# approximate gaussian fit by parabolic fit to logs >>>>> ? ? ? ?yvalues = n.array([ymax-2, ymax-1, ymax, ymax+1, ymax+2]) >>>>> >>>>> >>>>> svalues=n.array([arcImage[ymax-2][xdex],arcImage[ymax-1][xdex],arcImage[ym >>>>> ax >>>>> ][xdex],arcImage[ymax+1][xdex], arcImage[ymax+2][xdex]]) >>>>> ? ? ? ?avalues = n.log(svalues) >>>>> ? ? ? ?ypoly = n.polyfit(yvalues, avalues, 2) >>>>> >>>>> And the traceback: >>>>> File "/Users/williamcarithers/BOSS/src/calibrationModel.py", line 345, in >>>>> findPeak >>>>> ? ?ypoly = n.polyfit(yvalues, avalues, 2) >>>>> ?File >>>>> >>>>> "/System/Library/Frameworks/Python.framework/Versions/2.6/Extras/lib/pytho >>>>> n/ >>>>> numpy/lib/polynomial.py", line 503, in polyfit >>>>> ? ?c, resids, rank, s = _lstsq(v, y, rcond) >>>>> ?File >>>>> >>>>> "/System/Library/Frameworks/Python.framework/Versions/2.6/Extras/lib/pytho >>>>> n/ >>>>> numpy/lib/polynomial.py", line 46, in _lstsq >>>>> ? ?return lstsq(X, y, rcond) >>>>> ?File >>>>> >>>>> "/Library/Python/2.6/site-packages/scipy-0.7.1-py2.6-macosx-10.6-universal >>>>> .e >>>>> gg/scipy/linalg/basic.py", line 545, in lstsq >>>>> ? ?lwork = calc_lwork.gelss(gelss.prefix,m,n,nrhs)[1] >>>>> RuntimeError: more argument specifiers than keyword list entries >>>>> (remaining >>>>> format:'|:calc_lwork.gelss') >>>>> >>>>> This is such a simple application of polyfit and the error occurs in the >>>>> guts of lstsq, so I'm completely stumped. Any help would be greatly >>>>> appreciated. >>> >>> which version of numpy and the arguments to polyfit would be useful >>> information,e.g. >>> >>> print repr(yvalues) >>> print repr(avalues) >>> >>> before the call to polyfit >> >> Hi Josef, >> >> I'm using numpy-1.2.1 > > I don't remember whether 1.2.1 was fully python 2.6 compatible. I > would recommend upgrading if possible. > > I don't have any problems with more recent versions of scipy and numpy > >>>> import numpy as np >>>> y = np.array([ 864., 865., 866., 867., 868.]) >>>> x = np.array([ 5.24860191, 6.0217514 , 6.11434555, 6.09198856, >>>> 5.73753977],dtype=np.float32) >>>> np.polyfit(y, x, 2) > array([ -1.69296265e-01, 2.93325942e+02, -1.27049335e+05]) > > > I didn't know numpy will use the scipy version of linalg for this. > Do the scipy.test() pass? > > My guess would be that there are some incompatibilities with your > python/numpy/scipy versions. 
> > Josef > > >> >> Here are the arrays >> array([ 864., ?865., ?866., ?867., ?868.]) >> array([ 5.24860191, ?6.0217514 , ?6.11434555, ?6.09198856, ?5.73753977], >> dtype=float32) >> >> thanks >>> >>> Josef >>> >>> >>>>> >>>>> Thanks, >>>>> Bill Carithers >>>>> >>>>> >>>>> _______________________________________________ >>>>> NumPy-Discussion mailing list >>>>> NumPy-Discussion at scipy.org >>>>> http://mail.scipy.org/mailman/listinfo/numpy-discussion >>>> >>>> >>>> >>>> -- >>>> Mathematician: noun, someone who disavows certainty when their uncertainty >>>> set is non-empty, even if that set has measure zero. >>>> >>>> Hope: noun, that delusive spirit which escaped Pandora's jar and, with her >>>> lies, prevents mankind from committing a general suicide. ?(As interpreted >>>> by Robert Graves) >>>> >>>> _______________________________________________ >>>> NumPy-Discussion mailing list >>>> NumPy-Discussion at scipy.org >>>> http://mail.scipy.org/mailman/listinfo/numpy-discussion >>>> >>>> >>> _______________________________________________ >>> NumPy-Discussion mailing list >>> NumPy-Discussion at scipy.org >>> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From d.l.goldsmith at gmail.com Wed May 19 16:34:46 2010 From: d.l.goldsmith at gmail.com (David Goldsmith) Date: Wed, 19 May 2010 13:34:46 -0700 Subject: [Numpy-discussion] Runtime error in numpy.polyfit In-Reply-To: References: Message-ID: The polynomial module definitely postdates 1.2.1; I echo Josef's rec. that you update if possible. On Wed, May 19, 2010 at 1:24 PM, William Carithers wrote: > Hi Josef, > > I didn't know numpy will use the scipy version of linalg for this. > Right, that's what told me he must be using an old (and to-be-deprecated) version of polyfit; IIRC, the new polynomial module is "all-numpy," right Charles? DG -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Wed May 19 16:35:36 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Wed, 19 May 2010 16:35:36 -0400 Subject: [Numpy-discussion] Runtime error in numpy.polyfit In-Reply-To: References: Message-ID: On Wed, May 19, 2010 at 4:24 PM, William Carithers wrote: > Hi Josef, > > I did the same test, namely opening a new window and plugging in the > printout values by hand and polyfit worked just fine. Here's the terminal > output: > >>>> import numpy as n >>>> y = n.array([ 864., ?865., ?866., ?867., ?868.]) >>>> a = n.array([ 5.24860191, ?6.0217514 , ?6.11434555, ?6.09198856, > 5.73753977]) here you dropped the ,dtype=np.float32) from your previous numbers >>>> ypoly = n.polyfit(y,a,2) >>>> ypoly > array([ -1.69296264e-01, ? 2.93325941e+02, ?-1.27049334e+05]) > > I wonder if the step of printing plus cut and paste is doing some kind of > implicit type conversion. Maybe the original problem has to do with data > types? In the original code arcImage is integer data so the avalues array is > constructed from > ?avalues = n.log(n.array([...list of integers...])) > > Should I be doing some kind of casting first? 
That's what I thought when I saw your dtype=np.float32 but from your repr print it looks like the y array is float64, and only the second is "non-standard" You could try to cast inside your function to float (float64) However, I think this is only a short-term solution, my guess is that your exception is a symptom for more serious/pervasive problems. Also, I don't know why in your example (if I interpret it correctly) np.log results in float32 >>> np.log(np.array([5,2],int)).dtype dtype('float64') Josef > > Thanks, > Bill > > > > > > On 5/19/10 1:09 PM, "josef.pktd at gmail.com" wrote: > >> On Wed, May 19, 2010 at 3:51 PM, William Carithers >> wrote: >>> Thanks David and Josef. Replies interspersed below. >>> >>> >>> On 5/19/10 12:24 PM, "josef.pktd at gmail.com" wrote: >>> >>>> On Wed, May 19, 2010 at 3:18 PM, David Goldsmith >>>> wrote: >>>>> Charles H.: is this happening because he's calling the old version of >>>>> polyfit? >>>>> >>>>> William: try using numpy.polynomial.polyfit instead, see if that works. >>> >>> It says ?ypoly = n.polynomial.polyfit(yvalues, avalues, 2) >>> AttributeError: 'module' object has no attribute 'polynomial' >>> >>> Is this because I'm using a relatively old (numpy-1.2.1) version? >>>>> >>>>> DG >>>>> >>>>> On Wed, May 19, 2010 at 11:03 AM, William Carithers >>>>> wrote: >>>>>> >>>>>> I'm trying to do a simple 2nd degree polynomial fit to two arrays of 5 >>>>>> entries. I get a runtime error: >>>>>> RuntimeError: more argument specifiers than keyword list entries >>>>>> (remaining >>>>>> format:'|:calc_lwork.gelss') ?in the lstsq module inside numpy.polyfit. >>>>>> >>>>>> Here's the code snippet: >>>>>> def findPeak(self, ydex, xdex): >>>>>> ? ? ? ?# take a vertical slice >>>>>> ? ? ? ?vslice = [] >>>>>> ? ? ? ?for i in range(-1,10,1) : >>>>>> ? ? ? ? ? ?vslice.append(arcImage[ydex+i][xdex]) >>>>>> ? ? ? ?vdex = n.where(vslice == max(vslice)) >>>>>> ? ? ? ?ymax = ydex -1 + vdex[0][0] >>>>>> ? ? ? ?# approximate gaussian fit by parabolic fit to logs >>>>>> ? ? ? ?yvalues = n.array([ymax-2, ymax-1, ymax, ymax+1, ymax+2]) >>>>>> >>>>>> >>>>>> svalues=n.array([arcImage[ymax-2][xdex],arcImage[ymax-1][xdex],arcImage[ym >>>>>> ax >>>>>> ][xdex],arcImage[ymax+1][xdex], arcImage[ymax+2][xdex]]) >>>>>> ? ? ? ?avalues = n.log(svalues) >>>>>> ? ? ? ?ypoly = n.polyfit(yvalues, avalues, 2) >>>>>> >>>>>> And the traceback: >>>>>> File "/Users/williamcarithers/BOSS/src/calibrationModel.py", line 345, in >>>>>> findPeak >>>>>> ? ?ypoly = n.polyfit(yvalues, avalues, 2) >>>>>> ?File >>>>>> >>>>>> "/System/Library/Frameworks/Python.framework/Versions/2.6/Extras/lib/pytho >>>>>> n/ >>>>>> numpy/lib/polynomial.py", line 503, in polyfit >>>>>> ? ?c, resids, rank, s = _lstsq(v, y, rcond) >>>>>> ?File >>>>>> >>>>>> "/System/Library/Frameworks/Python.framework/Versions/2.6/Extras/lib/pytho >>>>>> n/ >>>>>> numpy/lib/polynomial.py", line 46, in _lstsq >>>>>> ? ?return lstsq(X, y, rcond) >>>>>> ?File >>>>>> >>>>>> "/Library/Python/2.6/site-packages/scipy-0.7.1-py2.6-macosx-10.6-universal >>>>>> .e >>>>>> gg/scipy/linalg/basic.py", line 545, in lstsq >>>>>> ? ?lwork = calc_lwork.gelss(gelss.prefix,m,n,nrhs)[1] >>>>>> RuntimeError: more argument specifiers than keyword list entries >>>>>> (remaining >>>>>> format:'|:calc_lwork.gelss') >>>>>> >>>>>> This is such a simple application of polyfit and the error occurs in the >>>>>> guts of lstsq, so I'm completely stumped. Any help would be greatly >>>>>> appreciated. 
>>>> >>>> which version of numpy and the arguments to polyfit would be useful >>>> information,e.g. >>>> >>>> print repr(yvalues) >>>> print repr(avalues) >>>> >>>> before the call to polyfit >>> >>> Hi Josef, >>> >>> I'm using numpy-1.2.1 >> >> I don't remember whether 1.2.1 was fully python 2.6 compatible. I >> would recommend upgrading if possible. >> >> I don't have any problems with more recent versions of scipy and numpy >> >>>>> import numpy as np >>>>> y = np.array([ 864., ?865., ?866., ?867., ?868.]) >>>>> x = np.array([ 5.24860191, ?6.0217514 , ?6.11434555, ?6.09198856, >>>>> 5.73753977],dtype=np.float32) >>>>> np.polyfit(y, x, 2) >> array([ -1.69296265e-01, ? 2.93325942e+02, ?-1.27049335e+05]) >> >> >> I didn't know numpy will use the scipy version of linalg for this. >> Do the scipy.test() pass? >> >> My guess would be that there are some incompatibilities with your >> python/numpy/scipy versions. >> >> Josef >> >> >>> >>> Here are the arrays >>> array([ 864., ?865., ?866., ?867., ?868.]) >>> array([ 5.24860191, ?6.0217514 , ?6.11434555, ?6.09198856, ?5.73753977], >>> dtype=float32) >>> >>> thanks >>>> >>>> Josef >>>> >>>> >>>>>> >>>>>> Thanks, >>>>>> Bill Carithers >>>>>> >>>>>> >>>>>> _______________________________________________ >>>>>> NumPy-Discussion mailing list >>>>>> NumPy-Discussion at scipy.org >>>>>> http://mail.scipy.org/mailman/listinfo/numpy-discussion >>>>> >>>>> >>>>> >>>>> -- >>>>> Mathematician: noun, someone who disavows certainty when their uncertainty >>>>> set is non-empty, even if that set has measure zero. >>>>> >>>>> Hope: noun, that delusive spirit which escaped Pandora's jar and, with her >>>>> lies, prevents mankind from committing a general suicide. ?(As interpreted >>>>> by Robert Graves) >>>>> >>>>> _______________________________________________ >>>>> NumPy-Discussion mailing list >>>>> NumPy-Discussion at scipy.org >>>>> http://mail.scipy.org/mailman/listinfo/numpy-discussion >>>>> >>>>> >>>> _______________________________________________ >>>> NumPy-Discussion mailing list >>>> NumPy-Discussion at scipy.org >>>> http://mail.scipy.org/mailman/listinfo/numpy-discussion >>> >>> >>> _______________________________________________ >>> NumPy-Discussion mailing list >>> NumPy-Discussion at scipy.org >>> http://mail.scipy.org/mailman/listinfo/numpy-discussion >>> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From wccarithers at lbl.gov Wed May 19 18:50:12 2010 From: wccarithers at lbl.gov (William Carithers) Date: Wed, 19 May 2010 15:50:12 -0700 Subject: [Numpy-discussion] Runtime error in numpy.polyfit In-Reply-To: Message-ID: Hi David and Josef, OK, I updated to numpy-1.4.1 and scipy-0.7.2 and this problem went away. Thanks for your help. BTW, trying to upgrade using the .dmg files from Sourceforge didn't work. It kept saying that it needed System Python 2.6 even though Python 2.6 is already installed. In fact, it was packaged with the OSX 10.6 upgrade. I had to download the tarballs and install from source. 
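Once the fit itself works, the quantity findPeak() is after (the centre of the Gaussian, approximated by the vertex of the fitted log-parabola) follows directly from the coefficients. A short sketch using the numbers posted earlier in the thread:

import numpy as np

y = np.array([864., 865., 866., 867., 868.])
a = np.array([5.24860191, 6.0217514, 6.11434555, 6.09198856, 5.73753977])

c2, c1, c0 = np.polyfit(y, a, 2)   # coefficients, highest power first
peak = -c1 / (2.0 * c2)            # vertex of c2*y**2 + c1*y + c0
print(peak)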
Cheers, Bill On 5/19/10 1:35 PM, "josef.pktd at gmail.com" wrote: > On Wed, May 19, 2010 at 4:24 PM, William Carithers > wrote: >> Hi Josef, >> >> I did the same test, namely opening a new window and plugging in the >> printout values by hand and polyfit worked just fine. Here's the terminal >> output: >> >>>>> import numpy as n >>>>> y = n.array([ 864., ?865., ?866., ?867., ?868.]) >>>>> a = n.array([ 5.24860191, ?6.0217514 , ?6.11434555, ?6.09198856, >> 5.73753977]) > > here you dropped the ,dtype=np.float32) from your previous numbers > >>>>> ypoly = n.polyfit(y,a,2) >>>>> ypoly >> array([ -1.69296264e-01, ? 2.93325941e+02, ?-1.27049334e+05]) >> >> I wonder if the step of printing plus cut and paste is doing some kind of >> implicit type conversion. Maybe the original problem has to do with data >> types? In the original code arcImage is integer data so the avalues array is >> constructed from >> ?avalues = n.log(n.array([...list of integers...])) >> >> Should I be doing some kind of casting first? > > That's what I thought when I saw your dtype=np.float32 > but from your repr print it looks like the y array is float64, and > only the second is "non-standard" > > You could try to cast inside your function to float (float64) > > However, I think this is only a short-term solution, my guess is that > your exception is a symptom for more serious/pervasive problems. > > Also, I don't know why in your example (if I interpret it correctly) > np.log results in float32 > >>>> np.log(np.array([5,2],int)).dtype > dtype('float64') > > Josef > >> >> Thanks, >> Bill >> >> >> >> >> >> On 5/19/10 1:09 PM, "josef.pktd at gmail.com" wrote: >> >>> On Wed, May 19, 2010 at 3:51 PM, William Carithers >>> wrote: >>>> Thanks David and Josef. Replies interspersed below. >>>> >>>> >>>> On 5/19/10 12:24 PM, "josef.pktd at gmail.com" wrote: >>>> >>>>> On Wed, May 19, 2010 at 3:18 PM, David Goldsmith >>>>> wrote: >>>>>> Charles H.: is this happening because he's calling the old version of >>>>>> polyfit? >>>>>> >>>>>> William: try using numpy.polynomial.polyfit instead, see if that works. >>>> >>>> It says ?ypoly = n.polynomial.polyfit(yvalues, avalues, 2) >>>> AttributeError: 'module' object has no attribute 'polynomial' >>>> >>>> Is this because I'm using a relatively old (numpy-1.2.1) version? >>>>>> >>>>>> DG >>>>>> >>>>>> On Wed, May 19, 2010 at 11:03 AM, William Carithers >>>>>> wrote: >>>>>>> >>>>>>> I'm trying to do a simple 2nd degree polynomial fit to two arrays of 5 >>>>>>> entries. I get a runtime error: >>>>>>> RuntimeError: more argument specifiers than keyword list entries >>>>>>> (remaining >>>>>>> format:'|:calc_lwork.gelss') ?in the lstsq module inside numpy.polyfit. >>>>>>> >>>>>>> Here's the code snippet: >>>>>>> def findPeak(self, ydex, xdex): >>>>>>> ? ? ? ?# take a vertical slice >>>>>>> ? ? ? ?vslice = [] >>>>>>> ? ? ? ?for i in range(-1,10,1) : >>>>>>> ? ? ? ? ? ?vslice.append(arcImage[ydex+i][xdex]) >>>>>>> ? ? ? ?vdex = n.where(vslice == max(vslice)) >>>>>>> ? ? ? ?ymax = ydex -1 + vdex[0][0] >>>>>>> ? ? ? ?# approximate gaussian fit by parabolic fit to logs >>>>>>> ? ? ? ?yvalues = n.array([ymax-2, ymax-1, ymax, ymax+1, ymax+2]) >>>>>>> >>>>>>> >>>>>>> svalues=n.array([arcImage[ymax-2][xdex],arcImage[ymax-1][xdex],arcImage[ >>>>>>> ym >>>>>>> ax >>>>>>> ][xdex],arcImage[ymax+1][xdex], arcImage[ymax+2][xdex]]) >>>>>>> ? ? ? ?avalues = n.log(svalues) >>>>>>> ? ? ? 
?ypoly = n.polyfit(yvalues, avalues, 2) >>>>>>> >>>>>>> And the traceback: >>>>>>> File "/Users/williamcarithers/BOSS/src/calibrationModel.py", line 345, >>>>>>> in >>>>>>> findPeak >>>>>>> ? ?ypoly = n.polyfit(yvalues, avalues, 2) >>>>>>> ?File >>>>>>> >>>>>>> "/System/Library/Frameworks/Python.framework/Versions/2.6/Extras/lib/pyt >>>>>>> ho >>>>>>> n/ >>>>>>> numpy/lib/polynomial.py", line 503, in polyfit >>>>>>> ? ?c, resids, rank, s = _lstsq(v, y, rcond) >>>>>>> ?File >>>>>>> >>>>>>> "/System/Library/Frameworks/Python.framework/Versions/2.6/Extras/lib/pyt >>>>>>> ho >>>>>>> n/ >>>>>>> numpy/lib/polynomial.py", line 46, in _lstsq >>>>>>> ? ?return lstsq(X, y, rcond) >>>>>>> ?File >>>>>>> >>>>>>> "/Library/Python/2.6/site-packages/scipy-0.7.1-py2.6-macosx-10.6-univers >>>>>>> al >>>>>>> .e >>>>>>> gg/scipy/linalg/basic.py", line 545, in lstsq >>>>>>> ? ?lwork = calc_lwork.gelss(gelss.prefix,m,n,nrhs)[1] >>>>>>> RuntimeError: more argument specifiers than keyword list entries >>>>>>> (remaining >>>>>>> format:'|:calc_lwork.gelss') >>>>>>> >>>>>>> This is such a simple application of polyfit and the error occurs in the >>>>>>> guts of lstsq, so I'm completely stumped. Any help would be greatly >>>>>>> appreciated. >>>>> >>>>> which version of numpy and the arguments to polyfit would be useful >>>>> information,e.g. >>>>> >>>>> print repr(yvalues) >>>>> print repr(avalues) >>>>> >>>>> before the call to polyfit >>>> >>>> Hi Josef, >>>> >>>> I'm using numpy-1.2.1 >>> >>> I don't remember whether 1.2.1 was fully python 2.6 compatible. I >>> would recommend upgrading if possible. >>> >>> I don't have any problems with more recent versions of scipy and numpy >>> >>>>>> import numpy as np >>>>>> y = np.array([ 864., ?865., ?866., ?867., ?868.]) >>>>>> x = np.array([ 5.24860191, ?6.0217514 , ?6.11434555, ?6.09198856, >>>>>> 5.73753977],dtype=np.float32) >>>>>> np.polyfit(y, x, 2) >>> array([ -1.69296265e-01, ? 2.93325942e+02, ?-1.27049335e+05]) >>> >>> >>> I didn't know numpy will use the scipy version of linalg for this. >>> Do the scipy.test() pass? >>> >>> My guess would be that there are some incompatibilities with your >>> python/numpy/scipy versions. >>> >>> Josef >>> >>> >>>> >>>> Here are the arrays >>>> array([ 864., ?865., ?866., ?867., ?868.]) >>>> array([ 5.24860191, ?6.0217514 , ?6.11434555, ?6.09198856, ?5.73753977], >>>> dtype=float32) >>>> >>>> thanks >>>>> >>>>> Josef >>>>> >>>>> >>>>>>> >>>>>>> Thanks, >>>>>>> Bill Carithers >>>>>>> >>>>>>> >>>>>>> _______________________________________________ >>>>>>> NumPy-Discussion mailing list >>>>>>> NumPy-Discussion at scipy.org >>>>>>> http://mail.scipy.org/mailman/listinfo/numpy-discussion >>>>>> >>>>>> >>>>>> >>>>>> -- >>>>>> Mathematician: noun, someone who disavows certainty when their >>>>>> uncertainty >>>>>> set is non-empty, even if that set has measure zero. >>>>>> >>>>>> Hope: noun, that delusive spirit which escaped Pandora's jar and, with >>>>>> her >>>>>> lies, prevents mankind from committing a general suicide. 
?(As >>>>>> interpreted >>>>>> by Robert Graves) >>>>>> >>>>>> _______________________________________________ >>>>>> NumPy-Discussion mailing list >>>>>> NumPy-Discussion at scipy.org >>>>>> http://mail.scipy.org/mailman/listinfo/numpy-discussion >>>>>> >>>>>> >>>>> _______________________________________________ >>>>> NumPy-Discussion mailing list >>>>> NumPy-Discussion at scipy.org >>>>> http://mail.scipy.org/mailman/listinfo/numpy-discussion >>>> >>>> >>>> _______________________________________________ >>>> NumPy-Discussion mailing list >>>> NumPy-Discussion at scipy.org >>>> http://mail.scipy.org/mailman/listinfo/numpy-discussion >>>> >>> _______________________________________________ >>> NumPy-Discussion mailing list >>> NumPy-Discussion at scipy.org >>> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From steven.nien at gmail.com Wed May 19 19:43:21 2010 From: steven.nien at gmail.com (Steven Nien) Date: Thu, 20 May 2010 07:43:21 +0800 Subject: [Numpy-discussion] Problem in swig Message-ID: I want to pass 1 integer and 2 numpy array into the C function as followings: void update_incident_field(int k, float *a, int na, float *b, int nb) { for (i=0; i From dsdale24 at gmail.com Wed May 19 20:06:42 2010 From: dsdale24 at gmail.com (Darren Dale) Date: Wed, 19 May 2010 20:06:42 -0400 Subject: [Numpy-discussion] question about creating numpy arrays In-Reply-To: References: Message-ID: On Wed, May 19, 2010 at 4:19 PM, wrote: > On Wed, May 19, 2010 at 4:08 PM, Darren Dale wrote: >> I have a question about creation of numpy arrays from a list of >> objects, which bears on the Quantities project and also on masked >> arrays: >> >>>>> import quantities as pq >>>>> import numpy as np >>>>> a, b = 2*pq.m,1*pq.s >>>>> np.array([a, b]) >> array([ 12., ? 1.]) >> >> Why doesn't that create an object array? Similarly: >> >>>>> m = np.ma.array([1], mask=[True]) >>>>> m >> masked_array(data = [--], >> ? ? ? ? ? ? mask = [ True], >> ? ? ? fill_value = 999999) >> >>>>> np.array([m]) >> array([[1]]) >> >> This has broader implications than just creating arrays, for example: >> >>>>> np.sum([m, m]) >> 2 >>>>> np.sum([a, b]) >> 13.0 >> >> Any thoughts? > > These are "array_like" of floats, so why should it create anything > else than an array of floats. I gave two counterexamples of why. From d.l.goldsmith at gmail.com Wed May 19 22:17:26 2010 From: d.l.goldsmith at gmail.com (David Goldsmith) Date: Wed, 19 May 2010 19:17:26 -0700 Subject: [Numpy-discussion] Runtime error in numpy.polyfit In-Reply-To: References: Message-ID: On Wed, May 19, 2010 at 3:50 PM, William Carithers wrote: > Hi David and Josef, > > OK, I updated to numpy-1.4.1 and scipy-0.7.2 and this problem went away. > Thanks for your help. > > BTW, trying to upgrade using the .dmg files from Sourceforge didn't work. > It > kept saying that it needed System Python 2.6 even though Python 2.6 is > already installed. In fact, it was packaged with the OSX 10.6 upgrade. I > had > to download the tarballs and install from source. 
> Yeah, Chris Barker typically recommends that Mac users get the dmgs from here: http://wiki.python.org/moin/MacPython/Packages but they don't appear to have anything for Python 2.6 yet. :-( Chris, any ideas? DG > > Cheers, > Bill > > > On 5/19/10 1:35 PM, "josef.pktd at gmail.com" wrote: > > > On Wed, May 19, 2010 at 4:24 PM, William Carithers > > wrote: > >> Hi Josef, > >> > >> I did the same test, namely opening a new window and plugging in the > >> printout values by hand and polyfit worked just fine. Here's the > terminal > >> output: > >> > >>>>> import numpy as n > >>>>> y = n.array([ 864., 865., 866., 867., 868.]) > >>>>> a = n.array([ 5.24860191, 6.0217514 , 6.11434555, 6.09198856, > >> 5.73753977]) > > > > here you dropped the ,dtype=np.float32) from your previous numbers > > > >>>>> ypoly = n.polyfit(y,a,2) > >>>>> ypoly > >> array([ -1.69296264e-01, 2.93325941e+02, -1.27049334e+05]) > >> > >> I wonder if the step of printing plus cut and paste is doing some kind > of > >> implicit type conversion. Maybe the original problem has to do with data > >> types? In the original code arcImage is integer data so the avalues > array is > >> constructed from > >> avalues = n.log(n.array([...list of integers...])) > >> > >> Should I be doing some kind of casting first? > > > > That's what I thought when I saw your dtype=np.float32 > > but from your repr print it looks like the y array is float64, and > > only the second is "non-standard" > > > > You could try to cast inside your function to float (float64) > > > > However, I think this is only a short-term solution, my guess is that > > your exception is a symptom for more serious/pervasive problems. > > > > Also, I don't know why in your example (if I interpret it correctly) > > np.log results in float32 > > > >>>> np.log(np.array([5,2],int)).dtype > > dtype('float64') > > > > Josef > > > >> > >> Thanks, > >> Bill > >> > >> > >> > >> > >> > >> On 5/19/10 1:09 PM, "josef.pktd at gmail.com" > wrote: > >> > >>> On Wed, May 19, 2010 at 3:51 PM, William Carithers < > wccarithers at lbl.gov> > >>> wrote: > >>>> Thanks David and Josef. Replies interspersed below. > >>>> > >>>> > >>>> On 5/19/10 12:24 PM, "josef.pktd at gmail.com" > wrote: > >>>> > >>>>> On Wed, May 19, 2010 at 3:18 PM, David Goldsmith > >>>>> wrote: > >>>>>> Charles H.: is this happening because he's calling the old version > of > >>>>>> polyfit? > >>>>>> > >>>>>> William: try using numpy.polynomial.polyfit instead, see if that > works. > >>>> > >>>> It says ypoly = n.polynomial.polyfit(yvalues, avalues, 2) > >>>> AttributeError: 'module' object has no attribute 'polynomial' > >>>> > >>>> Is this because I'm using a relatively old (numpy-1.2.1) version? > >>>>>> > >>>>>> DG > >>>>>> > >>>>>> On Wed, May 19, 2010 at 11:03 AM, William Carithers < > wccarithers at lbl.gov> > >>>>>> wrote: > >>>>>>> > >>>>>>> I'm trying to do a simple 2nd degree polynomial fit to two arrays > of 5 > >>>>>>> entries. I get a runtime error: > >>>>>>> RuntimeError: more argument specifiers than keyword list entries > >>>>>>> (remaining > >>>>>>> format:'|:calc_lwork.gelss') in the lstsq module inside > numpy.polyfit. 
> >>>>>>> > >>>>>>> Here's the code snippet: > >>>>>>> def findPeak(self, ydex, xdex): > >>>>>>> # take a vertical slice > >>>>>>> vslice = [] > >>>>>>> for i in range(-1,10,1) : > >>>>>>> vslice.append(arcImage[ydex+i][xdex]) > >>>>>>> vdex = n.where(vslice == max(vslice)) > >>>>>>> ymax = ydex -1 + vdex[0][0] > >>>>>>> # approximate gaussian fit by parabolic fit to logs > >>>>>>> yvalues = n.array([ymax-2, ymax-1, ymax, ymax+1, ymax+2]) > >>>>>>> > >>>>>>> > >>>>>>> > svalues=n.array([arcImage[ymax-2][xdex],arcImage[ymax-1][xdex],arcImage[ > >>>>>>> ym > >>>>>>> ax > >>>>>>> ][xdex],arcImage[ymax+1][xdex], arcImage[ymax+2][xdex]]) > >>>>>>> avalues = n.log(svalues) > >>>>>>> ypoly = n.polyfit(yvalues, avalues, 2) > >>>>>>> > >>>>>>> And the traceback: > >>>>>>> File "/Users/williamcarithers/BOSS/src/calibrationModel.py", line > 345, > >>>>>>> in > >>>>>>> findPeak > >>>>>>> ypoly = n.polyfit(yvalues, avalues, 2) > >>>>>>> File > >>>>>>> > >>>>>>> > "/System/Library/Frameworks/Python.framework/Versions/2.6/Extras/lib/pyt > >>>>>>> ho > >>>>>>> n/ > >>>>>>> numpy/lib/polynomial.py", line 503, in polyfit > >>>>>>> c, resids, rank, s = _lstsq(v, y, rcond) > >>>>>>> File > >>>>>>> > >>>>>>> > "/System/Library/Frameworks/Python.framework/Versions/2.6/Extras/lib/pyt > >>>>>>> ho > >>>>>>> n/ > >>>>>>> numpy/lib/polynomial.py", line 46, in _lstsq > >>>>>>> return lstsq(X, y, rcond) > >>>>>>> File > >>>>>>> > >>>>>>> > "/Library/Python/2.6/site-packages/scipy-0.7.1-py2.6-macosx-10.6-univers > >>>>>>> al > >>>>>>> .e > >>>>>>> gg/scipy/linalg/basic.py", line 545, in lstsq > >>>>>>> lwork = calc_lwork.gelss(gelss.prefix,m,n,nrhs)[1] > >>>>>>> RuntimeError: more argument specifiers than keyword list entries > >>>>>>> (remaining > >>>>>>> format:'|:calc_lwork.gelss') > >>>>>>> > >>>>>>> This is such a simple application of polyfit and the error occurs > in the > >>>>>>> guts of lstsq, so I'm completely stumped. Any help would be greatly > >>>>>>> appreciated. > >>>>> > >>>>> which version of numpy and the arguments to polyfit would be useful > >>>>> information,e.g. > >>>>> > >>>>> print repr(yvalues) > >>>>> print repr(avalues) > >>>>> > >>>>> before the call to polyfit > >>>> > >>>> Hi Josef, > >>>> > >>>> I'm using numpy-1.2.1 > >>> > >>> I don't remember whether 1.2.1 was fully python 2.6 compatible. I > >>> would recommend upgrading if possible. > >>> > >>> I don't have any problems with more recent versions of scipy and numpy > >>> > >>>>>> import numpy as np > >>>>>> y = np.array([ 864., 865., 866., 867., 868.]) > >>>>>> x = np.array([ 5.24860191, 6.0217514 , 6.11434555, 6.09198856, > >>>>>> 5.73753977],dtype=np.float32) > >>>>>> np.polyfit(y, x, 2) > >>> array([ -1.69296265e-01, 2.93325942e+02, -1.27049335e+05]) > >>> > >>> > >>> I didn't know numpy will use the scipy version of linalg for this. > >>> Do the scipy.test() pass? > >>> > >>> My guess would be that there are some incompatibilities with your > >>> python/numpy/scipy versions. 
> >>> > >>> Josef > >>> > >>> > >>>> > >>>> Here are the arrays > >>>> array([ 864., 865., 866., 867., 868.]) > >>>> array([ 5.24860191, 6.0217514 , 6.11434555, 6.09198856, > 5.73753977], > >>>> dtype=float32) > >>>> > >>>> thanks > >>>>> > >>>>> Josef > >>>>> > >>>>> > >>>>>>> > >>>>>>> Thanks, > >>>>>>> Bill Carithers > >>>>>>> > >>>>>>> > >>>>>>> _______________________________________________ > >>>>>>> NumPy-Discussion mailing list > >>>>>>> NumPy-Discussion at scipy.org > >>>>>>> http://mail.scipy.org/mailman/listinfo/numpy-discussion > >>>>>> > >>>>>> > >>>>>> > >>>>>> -- > >>>>>> Mathematician: noun, someone who disavows certainty when their > >>>>>> uncertainty > >>>>>> set is non-empty, even if that set has measure zero. > >>>>>> > >>>>>> Hope: noun, that delusive spirit which escaped Pandora's jar and, > with > >>>>>> her > >>>>>> lies, prevents mankind from committing a general suicide. (As > >>>>>> interpreted > >>>>>> by Robert Graves) > >>>>>> > >>>>>> _______________________________________________ > >>>>>> NumPy-Discussion mailing list > >>>>>> NumPy-Discussion at scipy.org > >>>>>> http://mail.scipy.org/mailman/listinfo/numpy-discussion > >>>>>> > >>>>>> > >>>>> _______________________________________________ > >>>>> NumPy-Discussion mailing list > >>>>> NumPy-Discussion at scipy.org > >>>>> http://mail.scipy.org/mailman/listinfo/numpy-discussion > >>>> > >>>> > >>>> _______________________________________________ > >>>> NumPy-Discussion mailing list > >>>> NumPy-Discussion at scipy.org > >>>> http://mail.scipy.org/mailman/listinfo/numpy-discussion > >>>> > >>> _______________________________________________ > >>> NumPy-Discussion mailing list > >>> NumPy-Discussion at scipy.org > >>> http://mail.scipy.org/mailman/listinfo/numpy-discussion > >> > >> > >> _______________________________________________ > >> NumPy-Discussion mailing list > >> NumPy-Discussion at scipy.org > >> http://mail.scipy.org/mailman/listinfo/numpy-discussion > >> > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -- Mathematician: noun, someone who disavows certainty when their uncertainty set is non-empty, even if that set has measure zero. Hope: noun, that delusive spirit which escaped Pandora's jar and, with her lies, prevents mankind from committing a general suicide. (As interpreted by Robert Graves) -------------- next part -------------- An HTML attachment was scrubbed... URL: From seb.haase at gmail.com Thu May 20 03:32:52 2010 From: seb.haase at gmail.com (Sebastian Haase) Date: Thu, 20 May 2010 09:32:52 +0200 Subject: [Numpy-discussion] Problem in swig In-Reply-To: References: Message-ID: Hi, I don't know exactly, but try replacing the one line %apply (float* INPLACE_ARRAY1, int DIM1) {(float *a, int na), (float *b, int nb)}; with two lines: %apply (float* INPLACE_ARRAY1, int DIM1) {(float *a, int na)}; %apply (float* INPLACE_ARRAY1, int DIM1) {(float *b, int nb)}; Don't know about the '{' '}' brakets. b could probably be "more general" as INPUT_ARRAY1 HTH -S. On Thu, May 20, 2010 at 1:43 AM, Steven Nien wrote: > I want to pass 1 integer and 2 numpy array into the C function as > followings: > > void update_incident_field(int k, float *a, int na, float *b, int nb) { > for (i=0; i ??? 
a[i] = a[i] + b[i] * k; > } > } > > But I don't know how to write the interface code (***.i) > Can someone help me? > > Thanks! > > The swig interface code I written(did't work, strange output) > > %module example > %{ > #define SWIG_FILE_WITH_INIT > #include "example.h" > %} > %include "numpy.i" > > %init %{ > ??? import_array(); > %} > > %apply (float* INPLACE_ARRAY1, int DIM1) {(float *a, int na), (float *b, int > nb)}; > > %include "example.h" > > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > From ralf.gommers at googlemail.com Thu May 20 06:10:41 2010 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Thu, 20 May 2010 18:10:41 +0800 Subject: [Numpy-discussion] Runtime error in numpy.polyfit In-Reply-To: References: Message-ID: On Thu, May 20, 2010 at 10:17 AM, David Goldsmith wrote: > On Wed, May 19, 2010 at 3:50 PM, William Carithers wrote: > >> Hi David and Josef, >> >> OK, I updated to numpy-1.4.1 and scipy-0.7.2 and this problem went away. >> Thanks for your help. >> >> BTW, trying to upgrade using the .dmg files from Sourceforge didn't work. >> It >> kept saying that it needed System Python 2.6 even though Python 2.6 is >> already installed. In fact, it was packaged with the OSX 10.6 upgrade. I >> had >> to download the tarballs and install from source. >> > The numpy/scipy binaries on sourceforge are built for the dmg installers from http://python.org/, which is the recommended place to get python from. The python provided by Apple is outdated by several release cycles, and can not be used with those binaries. I agree the help message about needing system python is not particularly helpful. Cheers, Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From ben.root at ou.edu Thu May 20 10:44:41 2010 From: ben.root at ou.edu (Benjamin Root) Date: Thu, 20 May 2010 09:44:41 -0500 Subject: [Numpy-discussion] question about creating numpy arrays In-Reply-To: References: Message-ID: > > I gave two counterexamples of why. > The examples you gave aren't counterexamples. See below... On Wed, May 19, 2010 at 7:06 PM, Darren Dale wrote: > On Wed, May 19, 2010 at 4:19 PM, wrote: > > On Wed, May 19, 2010 at 4:08 PM, Darren Dale wrote: > >> I have a question about creation of numpy arrays from a list of > >> objects, which bears on the Quantities project and also on masked > >> arrays: > >> > >>>>> import quantities as pq > >>>>> import numpy as np > >>>>> a, b = 2*pq.m,1*pq.s > >>>>> np.array([a, b]) > >> array([ 12., 1.]) > >> > >> Why doesn't that create an object array? Similarly: > >> > Consider the use case of a person creating a 1-D numpy array: > np.array([12.0, 1.0]) array([ 12., 1.]) How is python supposed to tell the difference between > np.array([a, b]) and > np.array([12.0, 1.0]) ? It can't, and there are plenty of times when one wants to explicitly initialize a small numpy array with a few discrete variables. > >>>>> m = np.ma.array([1], mask=[True]) > >>>>> m > >> masked_array(data = [--], > >> mask = [ True], > >> fill_value = 999999) > >> > >>>>> np.array([m]) > >> array([[1]]) > >> > Again, this is expected behavior. Numpy saw an array of an array, therefore, it produced a 2-D array. Consider the following: > np.array([[12, 4, 1], [32, 51, 9]]) I, as a user, expect numpy to create a 2-D array (2 rows, 3 columns) from that array of arrays. 
> >> This has broader implications than just creating arrays, for example: > >> > >>>>> np.sum([m, m]) > >> 2 > >>>>> np.sum([a, b]) > >> 13.0 > >> > If you wanted sums from each object, there are some better (i.e., more clear) ways to go about it. If you have a predetermined number of numpy-compatible objects, say a, b, c, then you can explicitly call the sum for each one: > a_sum = np.sum(a) > b_sum = np.sum(b) > c_sum = np.sum(c) Which I think communicates the programmer's intention better than (for a numpy array, x, composed of a, b, c): > object_sums = np.sum(x) # <--- As a numpy user, I would expect a scalar out of this, not an array If you have an arbitrary number of objects (which is what I suspect you have), then one could easily produce an array of sums (for a list, x, of numpy-compatible objects) like so: > object_sums = [np.sum(anObject) for anObject in x] Performance-wise, it should be no more or less efficient than having numpy somehow produce an array of sums from a single call to sum. Readability-wise, it makes more sense because when you are treating objects separately, a *list* of them is more intuitive than a numpy.array, which is more-or-less treated as a single mathematical entity. I hope that addresses your concerns. Ben Root -------------- next part -------------- An HTML attachment was scrubbed... URL: From rmay31 at gmail.com Thu May 20 11:30:47 2010 From: rmay31 at gmail.com (Ryan May) Date: Thu, 20 May 2010 10:30:47 -0500 Subject: [Numpy-discussion] question about creating numpy arrays In-Reply-To: References: Message-ID: On Thu, May 20, 2010 at 9:44 AM, Benjamin Root wrote: >> I gave two counterexamples of why. > > The examples you gave aren't counterexamples.? See below... > > On Wed, May 19, 2010 at 7:06 PM, Darren Dale wrote: >> >> On Wed, May 19, 2010 at 4:19 PM, ? wrote: >> > On Wed, May 19, 2010 at 4:08 PM, Darren Dale wrote: >> >> I have a question about creation of numpy arrays from a list of >> >> objects, which bears on the Quantities project and also on masked >> >> arrays: >> >> >> >>>>> import quantities as pq >> >>>>> import numpy as np >> >>>>> a, b = 2*pq.m,1*pq.s >> >>>>> np.array([a, b]) >> >> array([ 12., ? 1.]) >> >> >> >> Why doesn't that create an object array? Similarly: >> >> > > > Consider the use case of a person creating a 1-D numpy array: > ?> np.array([12.0, 1.0]) > array([ 12.,? 1.]) > > How is python supposed to tell the difference between > ?> np.array([a, b]) > and > ?> np.array([12.0, 1.0]) > ? > > It can't, and there are plenty of times when one wants to explicitly > initialize a small numpy array with a few discrete variables. What do you mean it can't? 12.0 and 1.0 are floats, a and b are not. While, yes, they can be coerced to floats, this is a *lossy* transformation--it strips away information contained in the class, and IMHO should not be the default behavior. If I want the objects, I can force it: In [7]: np.array([a,b],dtype=np.object) Out[7]: array([2.0 m, 1.0 s], dtype=object) This works fine, but feels ugly since I have to explicitly tell numpy not to do something. It feels to me like it's violating the principle of "in the face of ambiguity, resist the temptation to guess." 
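The same loss of information shows up with any ndarray subclass, not just quantities. A minimal sketch -- the Tagged class below is purely illustrative, it is not part of numpy or quantities:

import numpy as np

class Tagged(np.ndarray):
    """Toy stand-in for a unit-carrying ndarray subclass."""
    pass

a = np.array(2.0).view(Tagged)
b = np.array(1.0).view(Tagged)

coerced = np.array([a, b])               # plain float64 array, subclass stripped
kept = np.array([a, b], dtype=object)    # object array, Tagged instances preserved

print(type(coerced[0]))   # a bare numpy.float64
print(type(kept[0]))      # still Tagged

With the object array (or just a plain list) you then operate on the elements individually, e.g. [x.sum() for x in kept], which is essentially the workaround suggested above.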
Ryan -- Ryan May Graduate Research Assistant School of Meteorology University of Oklahoma From dsdale24 at gmail.com Thu May 20 11:37:50 2010 From: dsdale24 at gmail.com (Darren Dale) Date: Thu, 20 May 2010 11:37:50 -0400 Subject: [Numpy-discussion] question about creating numpy arrays In-Reply-To: References: Message-ID: On Thu, May 20, 2010 at 10:44 AM, Benjamin Root wrote: >> I gave two counterexamples of why. > > The examples you gave aren't counterexamples.? See below... I'm not interested in arguing over semantics. I've discovered an issue with how numpy deals with lists of objects that derive from ndarray, and am concerned about the implications for classes that extend ndarray. > On Wed, May 19, 2010 at 7:06 PM, Darren Dale wrote: >> >> On Wed, May 19, 2010 at 4:19 PM, ? wrote: >> > On Wed, May 19, 2010 at 4:08 PM, Darren Dale wrote: >> >> I have a question about creation of numpy arrays from a list of >> >> objects, which bears on the Quantities project and also on masked >> >> arrays: >> >> >> >>>>> import quantities as pq >> >>>>> import numpy as np >> >>>>> a, b = 2*pq.m,1*pq.s >> >>>>> np.array([a, b]) >> >> array([ 12., ? 1.]) >> >> >> >> Why doesn't that create an object array? Similarly: >> >> > > > Consider the use case of a person creating a 1-D numpy array: > ?> np.array([12.0, 1.0]) > array([ 12.,? 1.]) > > How is python supposed to tell the difference between > ?> np.array([a, b]) > and > ?> np.array([12.0, 1.0]) > ? > > It can't, and there are plenty of times when one wants to explicitly > initialize a small numpy array with a few discrete variables. > > >> >> >>>>> m = np.ma.array([1], mask=[True]) >> >>>>> m >> >> masked_array(data = [--], >> >> ? ? ? ? ? ? mask = [ True], >> >> ? ? ? fill_value = 999999) >> >> >> >>>>> np.array([m]) >> >> array([[1]]) >> >> > > Again, this is expected behavior.? Numpy saw an array of an array, > therefore, it produced a 2-D array. Consider the following: > > ?> np.array([[12, 4, 1], [32, 51, 9]]) > > I, as a user, expect numpy to create a 2-D array (2 rows, 3 columns) from > that array of arrays. > >> >> >> This has broader implications than just creating arrays, for example: >> >> >> >>>>> np.sum([m, m]) >> >> 2 >> >>>>> np.sum([a, b]) >> >> 13.0 >> >> > > > If you wanted sums from each object, there are some better (i.e., more > clear) ways to go about it.? If you have a predetermined number of > numpy-compatible objects, say a, b, c, then you can explicitly call the sum > for each one: > ?> a_sum = np.sum(a) > ?> b_sum = np.sum(b) > ?> c_sum = np.sum(c) > > Which I think communicates the programmer's intention better than (for a > numpy array, x, composed of a, b, c): > ?> object_sums = np.sum(x)?????? # <--- As a numpy user, I would expect a > scalar out of this, not an array > > If you have an arbitrary number of objects (which is what I suspect you > have), then one could easily produce an array of sums (for a list, x, of > numpy-compatible objects) like so: > ?> object_sums = [np.sum(anObject) for anObject in x] > > Performance-wise, it should be no more or less efficient than having numpy > somehow produce an array of sums from a single call to sum. > Readability-wise, it makes more sense because when you are treating objects > separately, a *list* of them is more intuitive than a numpy.array, which is > more-or-less treated as a single mathematical entity. > > I hope that addresses your concerns. I appreciate the response, but you are arguing that it is not a problem, and I'm certain that it is. 
It may not be numpy From wfspotz at sandia.gov Thu May 20 11:18:44 2010 From: wfspotz at sandia.gov (Bill Spotz) Date: Thu, 20 May 2010 11:18:44 -0400 Subject: [Numpy-discussion] Problem in swig In-Reply-To: References: Message-ID: I tried the following: %module example %{ #define SWIG_FILE_WITH_INIT //#include "example.h" %} %include "numpy.i" %init %{ import_array(); %} %apply (float* INPLACE_ARRAY1, int DIM1) {(float *a, int na), (float *b, int nb)}; %inline { void update_incident_field(int k, float *a, int na, float *b, int nb) { for (i=0; i I want to pass 1 integer and 2 numpy array into the C function as > followings: > > void update_incident_field(int k, float *a, int na, float *b, int > nb) { > for (i=0; i a[i] = a[i] + b[i] * k; > } > } > > But I don't know how to write the interface code (***.i) > Can someone help me? > > Thanks! > > The swig interface code I written(did't work, strange output) > > %module example > %{ > #define SWIG_FILE_WITH_INIT > #include "example.h" > %} > %include "numpy.i" > > %init %{ > import_array(); > %} > > %apply (float* INPLACE_ARRAY1, int DIM1) {(float *a, int na), (float > *b, int nb)}; > > %include "example.h" > > > > ** Bill Spotz ** ** Sandia National Laboratories Voice: (505)845-0170 ** ** P.O. Box 5800 Fax: (505)284-0154 ** ** Albuquerque, NM 87185-0370 Email: wfspotz at sandia.gov ** From dsdale24 at gmail.com Thu May 20 11:53:09 2010 From: dsdale24 at gmail.com (Darren Dale) Date: Thu, 20 May 2010 11:53:09 -0400 Subject: [Numpy-discussion] question about creating numpy arrays In-Reply-To: References: Message-ID: [sorry, my last got cut off] On Thu, May 20, 2010 at 11:37 AM, Darren Dale wrote: > On Thu, May 20, 2010 at 10:44 AM, Benjamin Root wrote: >>> I gave two counterexamples of why. >> >> The examples you gave aren't counterexamples.? See below... > > I'm not interested in arguing over semantics. I've discovered an issue > with how numpy deals with lists of objects that derive from ndarray, > and am concerned about the implications for classes that extend > ndarray. > >> On Wed, May 19, 2010 at 7:06 PM, Darren Dale wrote: >>> >>> On Wed, May 19, 2010 at 4:19 PM, ? wrote: >>> > On Wed, May 19, 2010 at 4:08 PM, Darren Dale wrote: >>> >> I have a question about creation of numpy arrays from a list of >>> >> objects, which bears on the Quantities project and also on masked >>> >> arrays: >>> >> >>> >>>>> import quantities as pq >>> >>>>> import numpy as np >>> >>>>> a, b = 2*pq.m,1*pq.s >>> >>>>> np.array([a, b]) >>> >> array([ 12., ? 1.]) >>> >> >>> >> Why doesn't that create an object array? Similarly: >>> >> >> >> >> Consider the use case of a person creating a 1-D numpy array: >> ?> np.array([12.0, 1.0]) >> array([ 12.,? 1.]) >> >> How is python supposed to tell the difference between >> ?> np.array([a, b]) >> and >> ?> np.array([12.0, 1.0]) >> ? >> >> It can't, and there are plenty of times when one wants to explicitly >> initialize a small numpy array with a few discrete variables. >> >> >>> >>> >>>>> m = np.ma.array([1], mask=[True]) >>> >>>>> m >>> >> masked_array(data = [--], >>> >> ? ? ? ? ? ? mask = [ True], >>> >> ? ? ? fill_value = 999999) >>> >> >>> >>>>> np.array([m]) >>> >> array([[1]]) >>> >> >> >> Again, this is expected behavior.? Numpy saw an array of an array, >> therefore, it produced a 2-D array. Consider the following: >> >> ?> np.array([[12, 4, 1], [32, 51, 9]]) >> >> I, as a user, expect numpy to create a 2-D array (2 rows, 3 columns) from >> that array of arrays. 
>> >>> >>> >> This has broader implications than just creating arrays, for example: >>> >> >>> >>>>> np.sum([m, m]) >>> >> 2 >>> >>>>> np.sum([a, b]) >>> >> 13.0 >>> >> >> >> >> If you wanted sums from each object, there are some better (i.e., more >> clear) ways to go about it.? If you have a predetermined number of >> numpy-compatible objects, say a, b, c, then you can explicitly call the sum >> for each one: >> ?> a_sum = np.sum(a) >> ?> b_sum = np.sum(b) >> ?> c_sum = np.sum(c) >> >> Which I think communicates the programmer's intention better than (for a >> numpy array, x, composed of a, b, c): >> ?> object_sums = np.sum(x)?????? # <--- As a numpy user, I would expect a >> scalar out of this, not an array >> >> If you have an arbitrary number of objects (which is what I suspect you >> have), then one could easily produce an array of sums (for a list, x, of >> numpy-compatible objects) like so: >> ?> object_sums = [np.sum(anObject) for anObject in x] >> >> Performance-wise, it should be no more or less efficient than having numpy >> somehow produce an array of sums from a single call to sum. >> Readability-wise, it makes more sense because when you are treating objects >> separately, a *list* of them is more intuitive than a numpy.array, which is >> more-or-less treated as a single mathematical entity. >> >> I hope that addresses your concerns. > > I appreciate the response, but you are arguing that it is not a > problem, and I'm certain that it is. It may not be numpy It may not be numpy's problem, I can accept that. But it is definitely a problem for quantities. I'm trying to determine just how big a problem it is. I had hoped that one day quantities might become a part of numpy or scipy, but this appears to be a fundamental issue and it makes me doubt that inclusion would be appropriate. Thank you for the suggestion about calling the sum method instead of numpy's function. That is a reasonable workaround. Darren From bsouthey at gmail.com Thu May 20 12:07:24 2010 From: bsouthey at gmail.com (Bruce Southey) Date: Thu, 20 May 2010 11:07:24 -0500 Subject: [Numpy-discussion] question about creating numpy arrays In-Reply-To: References: Message-ID: <4BF55E3C.2000907@gmail.com> On 05/20/2010 10:53 AM, Darren Dale wrote: > [sorry, my last got cut off] > > On Thu, May 20, 2010 at 11:37 AM, Darren Dale wrote: > >> On Thu, May 20, 2010 at 10:44 AM, Benjamin Root wrote: >> >>>> I gave two counterexamples of why. >>>> >>> The examples you gave aren't counterexamples. See below... >>> >> I'm not interested in arguing over semantics. I've discovered an issue >> with how numpy deals with lists of objects that derive from ndarray, >> and am concerned about the implications for classes that extend >> ndarray. >> >> >>> On Wed, May 19, 2010 at 7:06 PM, Darren Dale wrote: >>> >>>> On Wed, May 19, 2010 at 4:19 PM, wrote: >>>> >>>>> On Wed, May 19, 2010 at 4:08 PM, Darren Dale wrote: >>>>> >>>>>> I have a question about creation of numpy arrays from a list of >>>>>> objects, which bears on the Quantities project and also on masked >>>>>> arrays: >>>>>> >>>>>> >>>>>>>>> import quantities as pq >>>>>>>>> import numpy as np >>>>>>>>> a, b = 2*pq.m,1*pq.s >>>>>>>>> np.array([a, b]) >>>>>>>>> >>>>>> array([ 12., 1.]) >>>>>> >>>>>> Why doesn't that create an object array? 
Similarly: >>>>>> >>>>>> >>> >>> Consider the use case of a person creating a 1-D numpy array: >>> > np.array([12.0, 1.0]) >>> array([ 12., 1.]) >>> >>> How is python supposed to tell the difference between >>> > np.array([a, b]) >>> and >>> > np.array([12.0, 1.0]) >>> ? >>> >>> It can't, and there are plenty of times when one wants to explicitly >>> initialize a small numpy array with a few discrete variables. >>> >>> >>> >>>> >>>>>>>>> m = np.ma.array([1], mask=[True]) >>>>>>>>> m >>>>>>>>> >>>>>> masked_array(data = [--], >>>>>> mask = [ True], >>>>>> fill_value = 999999) >>>>>> >>>>>> >>>>>>>>> np.array([m]) >>>>>>>>> >>>>>> array([[1]]) >>>>>> >>>>>> >>> Again, this is expected behavior. Numpy saw an array of an array, >>> therefore, it produced a 2-D array. Consider the following: >>> >>> > np.array([[12, 4, 1], [32, 51, 9]]) >>> >>> I, as a user, expect numpy to create a 2-D array (2 rows, 3 columns) from >>> that array of arrays. >>> >>> >>>> >>>>>> This has broader implications than just creating arrays, for example: >>>>>> >>>>>> >>>>>>>>> np.sum([m, m]) >>>>>>>>> >>>>>> 2 >>>>>> >>>>>>>>> np.sum([a, b]) >>>>>>>>> >>>>>> 13.0 >>>>>> >>>>>> >>> >>> If you wanted sums from each object, there are some better (i.e., more >>> clear) ways to go about it. If you have a predetermined number of >>> numpy-compatible objects, say a, b, c, then you can explicitly call the sum >>> for each one: >>> > a_sum = np.sum(a) >>> > b_sum = np.sum(b) >>> > c_sum = np.sum(c) >>> >>> Which I think communicates the programmer's intention better than (for a >>> numpy array, x, composed of a, b, c): >>> > object_sums = np.sum(x) #<--- As a numpy user, I would expect a >>> scalar out of this, not an array >>> >>> If you have an arbitrary number of objects (which is what I suspect you >>> have), then one could easily produce an array of sums (for a list, x, of >>> numpy-compatible objects) like so: >>> > object_sums = [np.sum(anObject) for anObject in x] >>> >>> Performance-wise, it should be no more or less efficient than having numpy >>> somehow produce an array of sums from a single call to sum. >>> Readability-wise, it makes more sense because when you are treating objects >>> separately, a *list* of them is more intuitive than a numpy.array, which is >>> more-or-less treated as a single mathematical entity. >>> >>> I hope that addresses your concerns. >>> >> I appreciate the response, but you are arguing that it is not a >> problem, and I'm certain that it is. It may not be numpy >> > It may not be numpy's problem, I can accept that. But it is definitely > a problem for quantities. I'm trying to determine just how big a > problem it is. I had hoped that one day quantities might become a part > of numpy or scipy, but this appears to be a fundamental issue and it > makes me doubt that inclusion would be appropriate. > > Thank you for the suggestion about calling the sum method instead of > numpy's function. That is a reasonable workaround. > > Darren > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > Hi, np.array is an array creating function that numpy.array takes a array_like input and it *will* try to convert that input into an array. (This also occurs when you give np.array a masked array as an input.) This a 'feature' especially when you don't use the dtype argument and applies to any numpy function that takes array_like inputs. 
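For example, with a masked array (a rough sketch; np.asanyarray and np.ma.asarray are existing functions that avoid the conversion):

>>> import numpy as np
>>> m = np.ma.array([1, 2, 3], mask=[True, False, False])
>>> np.array(m)              # converted to a plain ndarray, the mask is dropped
array([1, 2, 3])
>>> np.asanyarray(m) is m    # subok=True: ndarray subclasses pass through untouched
True
>>> type(np.ma.asarray(m))
<class 'numpy.ma.core.MaskedArray'>

So anything that funnels its input through np.array/np.asarray will silently discard whatever the subclass was carrying -- the same thing that happens to the quantities objects above.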
I do not quantities, but you either have to get the user to use the appropriate quantities functions or let it remain 'user beware' when they do not use the appropriate functions. In the longer term you have to get numpy to 'do the right thing' with quantities objects. Bruce From ben.root at ou.edu Thu May 20 12:13:37 2010 From: ben.root at ou.edu (Benjamin Root) Date: Thu, 20 May 2010 11:13:37 -0500 Subject: [Numpy-discussion] question about creating numpy arrays In-Reply-To: References: Message-ID: On Thu, May 20, 2010 at 10:30 AM, Ryan May wrote: > On Thu, May 20, 2010 at 9:44 AM, Benjamin Root wrote: > >> I gave two counterexamples of why. > > > > The examples you gave aren't counterexamples. See below... > > > > On Wed, May 19, 2010 at 7:06 PM, Darren Dale wrote: > >> > >> On Wed, May 19, 2010 at 4:19 PM, wrote: > >> > On Wed, May 19, 2010 at 4:08 PM, Darren Dale > wrote: > >> >> I have a question about creation of numpy arrays from a list of > >> >> objects, which bears on the Quantities project and also on masked > >> >> arrays: > >> >> > >> >>>>> import quantities as pq > >> >>>>> import numpy as np > >> >>>>> a, b = 2*pq.m,1*pq.s > >> >>>>> np.array([a, b]) > >> >> array([ 12., 1.]) > >> >> > >> >> Why doesn't that create an object array? Similarly: > >> >> > > > > > > Consider the use case of a person creating a 1-D numpy array: > > > np.array([12.0, 1.0]) > > array([ 12., 1.]) > > > > How is python supposed to tell the difference between > > > np.array([a, b]) > > and > > > np.array([12.0, 1.0]) > > ? > > > > It can't, and there are plenty of times when one wants to explicitly > > initialize a small numpy array with a few discrete variables. > > What do you mean it can't? 12.0 and 1.0 are floats, a and b are not. > While, yes, they can be coerced to floats, this is a *lossy* > transformation--it strips away information contained in the class, and > IMHO should not be the default behavior. If I want the objects, I can > force it: > > In [7]: np.array([a,b],dtype=np.object) > Out[7]: array([2.0 m, 1.0 s], dtype=object) > > This works fine, but feels ugly since I have to explicitly tell numpy > not to do something. It feels to me like it's violating the principle > of "in the face of ambiguity, resist the temptation to guess." > I have thought about this further, and I think I am starting to see your point (from both of you). Here are my thoughts: As I understand it, numpy.array() (rather, array_like()) essentially builds the dimensions of the array by first identifying if there is an iterable object, and then if the contents of the iterable is also iterable, until it reaches a non-iterable. Therefore, the question becomes, why is numpy.array() implicitly coercing the non-iterable type into a numeric? Is there some reason that I am not seeing for why there is an implicit coercion? At first glance, I did not see a problem with this behavior, and I have come to expect it (hence my original reply). But now, I am not quite so sure. > Ryan > > -- > Ryan May > Graduate Research Assistant > School of Meteorology > University of Oklahoma > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From josef.pktd at gmail.com Thu May 20 12:21:09 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Thu, 20 May 2010 12:21:09 -0400 Subject: [Numpy-discussion] question about creating numpy arrays In-Reply-To: References: Message-ID: On Thu, May 20, 2010 at 12:13 PM, Benjamin Root wrote: > > > On Thu, May 20, 2010 at 10:30 AM, Ryan May wrote: >> >> On Thu, May 20, 2010 at 9:44 AM, Benjamin Root wrote: >> >> I gave two counterexamples of why. >> > >> > The examples you gave aren't counterexamples.? See below... >> > >> > On Wed, May 19, 2010 at 7:06 PM, Darren Dale wrote: >> >> >> >> On Wed, May 19, 2010 at 4:19 PM, ? wrote: >> >> > On Wed, May 19, 2010 at 4:08 PM, Darren Dale >> >> > wrote: >> >> >> I have a question about creation of numpy arrays from a list of >> >> >> objects, which bears on the Quantities project and also on masked >> >> >> arrays: >> >> >> >> >> >>>>> import quantities as pq >> >> >>>>> import numpy as np >> >> >>>>> a, b = 2*pq.m,1*pq.s >> >> >>>>> np.array([a, b]) >> >> >> array([ 12., ? 1.]) >> >> >> >> >> >> Why doesn't that create an object array? Similarly: >> >> >> >> > >> > >> > Consider the use case of a person creating a 1-D numpy array: >> > ?> np.array([12.0, 1.0]) >> > array([ 12.,? 1.]) >> > >> > How is python supposed to tell the difference between >> > ?> np.array([a, b]) >> > and >> > ?> np.array([12.0, 1.0]) >> > ? >> > >> > It can't, and there are plenty of times when one wants to explicitly >> > initialize a small numpy array with a few discrete variables. >> >> What do you mean it can't? 12.0 and 1.0 are floats, a and b are not. >> While, yes, they can be coerced to floats, this is a *lossy* >> transformation--it strips away information contained in the class, and >> IMHO should not be the default behavior. If I want the objects, I can >> force it: >> >> In [7]: np.array([a,b],dtype=np.object) >> Out[7]: array([2.0 m, 1.0 s], dtype=object) >> >> This works fine, but feels ugly since I have to explicitly tell numpy >> not to do something. It feels to me like it's violating the principle >> of "in the face of ambiguity, resist the temptation to guess." > > I have thought about this further, and I think I am starting to see your > point (from both of you).? Here are my thoughts: > > As I understand it, numpy.array() (rather, array_like()) essentially builds > the dimensions of the array by first identifying if there is an iterable > object, and then if the contents of the iterable is also iterable, until it > reaches a non-iterable. > > Therefore, the question becomes, why is numpy.array() implicitly coercing > the non-iterable type into a numeric?? Is there some reason that I am not > seeing for why there is an implicit coercion? 
I think because the dtype is numeric (float), otherwise it wouldn't operate on numbers, and none of the other numerical functions might work (just a guess) >>> a = np.array(['2.0', '1.0'], dtype=object) >>> a array([2.0, 1.0], dtype=object) >>> np.sqrt(a) Traceback (most recent call last): File "", line 1, in np.sqrt(a) AttributeError: sqrt >>> np.array([a,a]) array([[2.0, 1.0], [2.0, 1.0]], dtype=object) >>> 2*a array([2.02.0, 1.01.0], dtype=object) >>> b = np.array(['2.0', '1.0']) >>> np.sqrt(b) NotImplemented >>> np.array([b,b]) array([['2.0', '1.0'], ['2.0', '1.0']], dtype='|S3') >>> 2*b Traceback (most recent call last): File "", line 1, in 2*b TypeError: unsupported operand type(s) for *: 'int' and 'numpy.ndarray' Josef > > At first glance, I did not see a problem with this behavior, and I have come > to expect it (hence my original reply). But now, I am not quite so sure. > >> >> Ryan >> >> -- >> Ryan May >> Graduate Research Assistant >> School of Meteorology >> University of Oklahoma >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > From dsdale24 at gmail.com Thu May 20 12:42:20 2010 From: dsdale24 at gmail.com (Darren Dale) Date: Thu, 20 May 2010 12:42:20 -0400 Subject: [Numpy-discussion] question about creating numpy arrays In-Reply-To: <4BF55E3C.2000907@gmail.com> References: <4BF55E3C.2000907@gmail.com> Message-ID: On Thu, May 20, 2010 at 12:07 PM, Bruce Southey wrote: > np.array is an array creating function that numpy.array takes a > array_like input and it *will* try to convert that input into an array. > (This also occurs when you give np.array a masked array as an input.) > This a 'feature' especially when you don't use the dtype argument and > applies to any numpy function that takes array_like inputs. Ok. I can accept that. > I do not quantities, but you either have to get the user to use the > appropriate quantities functions or let it remain 'user beware' when > they do not use the appropriate functions. In the longer term you have > to get numpy to 'do the right thing' with quantities objects. I have done a bit of development on numpy to try to extend the __array_wrap__ mechanism so quantities could tell numpy how to do the right thing in many situations. That has been largely successful, but this issue we are discussing is demonstrating some unanticipated limitations. You may be right that this is a "user-beware" situation, since in this case there appears to be no way for an ndarray subclass to step in and influence what numpy will do with a list of those instances. Darren From kwgoodman at gmail.com Thu May 20 16:04:11 2010 From: kwgoodman at gmail.com (Keith Goodman) Date: Thu, 20 May 2010 13:04:11 -0700 Subject: [Numpy-discussion] dtype and array creation Message-ID: Why do the follow expressions give different dtype? 
>> np.array([1, 2, 3], dtype=str) array(['1', '2', '3'], dtype='|S1') >> np.array(np.array([1, 2, 3]), dtype=str) array(['1', '2', '3'], dtype='|S8') From josef.pktd at gmail.com Thu May 20 16:19:49 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Thu, 20 May 2010 16:19:49 -0400 Subject: [Numpy-discussion] dtype and array creation In-Reply-To: References: Message-ID: On Thu, May 20, 2010 at 4:04 PM, Keith Goodman wrote: > Why do the follow expressions give different dtype? > >>> np.array([1, 2, 3], dtype=str) > array(['1', '2', '3'], > ? ? ?dtype='|S1') >>> np.array(np.array([1, 2, 3]), dtype=str) > array(['1', '2', '3'], > ? ? ?dtype='|S8') you're on a 64bit machine? S8 is the same size as the float >>> np.array([8]).itemsize 4 >>> np.array(np.array([1, 2, 3]), dtype=str) array(['1', '2', '3'], dtype='|S4') >>> np.array([8]).view(dtype='S4') array(['\x08'], dtype='|S4') >>> np.array([8]).view(dtype='S1') array(['\x08', '', '', ''], dtype='|S1') But I don't know whether this is a desired feature, numpy might reuse the existing buffer (?) Josef > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From josef.pktd at gmail.com Thu May 20 16:21:40 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Thu, 20 May 2010 16:21:40 -0400 Subject: [Numpy-discussion] dtype and array creation In-Reply-To: References: Message-ID: On Thu, May 20, 2010 at 4:19 PM, wrote: > On Thu, May 20, 2010 at 4:04 PM, Keith Goodman wrote: >> Why do the follow expressions give different dtype? >> >>>> np.array([1, 2, 3], dtype=str) >> array(['1', '2', '3'], >> ? ? ?dtype='|S1') >>>> np.array(np.array([1, 2, 3]), dtype=str) >> array(['1', '2', '3'], >> ? ? ?dtype='|S8') > > you're on a 64bit machine? > > S8 is the same size as the float not float, it should be int, here is float on my Win32: >>> np.array(np.array([1., 2, 3]), dtype=str) array(['1.0', '2.0', '3.0'], dtype='|S8') >>> np.array([8.]).itemsize 8 > > >>>> np.array([8]).itemsize > 4 >>>> np.array(np.array([1, 2, 3]), dtype=str) > array(['1', '2', '3'], > ? ? ?dtype='|S4') >>>> np.array([8]).view(dtype='S4') > array(['\x08'], > ? ? ?dtype='|S4') >>>> np.array([8]).view(dtype='S1') > array(['\x08', '', '', ''], > ? ? ?dtype='|S1') > > But I don't know whether this is a desired feature, numpy might reuse > the existing buffer (?) > > Josef > > >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> > From kwgoodman at gmail.com Thu May 20 16:28:42 2010 From: kwgoodman at gmail.com (Keith Goodman) Date: Thu, 20 May 2010 13:28:42 -0700 Subject: [Numpy-discussion] dtype and array creation In-Reply-To: References: Message-ID: On Thu, May 20, 2010 at 1:19 PM, wrote: > On Thu, May 20, 2010 at 4:04 PM, Keith Goodman wrote: >> Why do the follow expressions give different dtype? >> >>>> np.array([1, 2, 3], dtype=str) >> array(['1', '2', '3'], >> ? ? ?dtype='|S1') >>>> np.array(np.array([1, 2, 3]), dtype=str) >> array(['1', '2', '3'], >> ? ? ?dtype='|S8') > > you're on a 64bit machine? > > S8 is the same size as the float > > >>>> np.array([8]).itemsize > 4 >>>> np.array(np.array([1, 2, 3]), dtype=str) > array(['1', '2', '3'], > ? ? ?dtype='|S4') >>>> np.array([8]).view(dtype='S4') > array(['\x08'], > ? ? ?dtype='|S4') >>>> np.array([8]).view(dtype='S1') > array(['\x08', '', '', ''], > ? ? 
?dtype='|S1') > > But I don't know whether this is a desired feature, numpy might reuse > the existing buffer (?) Yes, I'm on a 64-bit machine. That's what I thought so I tried this: >> a = np.array([1, 2, 3]) >> type(a[0]) >> np.array([a[0], a[1], a[2]], dtype=str) array(['1', '2', '3'], dtype='|S1') But it gives '|S1' too. I guess I'm lost. From josef.pktd at gmail.com Thu May 20 16:42:02 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Thu, 20 May 2010 16:42:02 -0400 Subject: [Numpy-discussion] dtype and array creation In-Reply-To: References: Message-ID: On Thu, May 20, 2010 at 4:28 PM, Keith Goodman wrote: > On Thu, May 20, 2010 at 1:19 PM, ? wrote: >> On Thu, May 20, 2010 at 4:04 PM, Keith Goodman wrote: >>> Why do the follow expressions give different dtype? >>> >>>>> np.array([1, 2, 3], dtype=str) >>> array(['1', '2', '3'], >>> ? ? ?dtype='|S1') >>>>> np.array(np.array([1, 2, 3]), dtype=str) >>> array(['1', '2', '3'], >>> ? ? ?dtype='|S8') >> >> you're on a 64bit machine? >> >> S8 is the same size as the float >> >> >>>>> np.array([8]).itemsize >> 4 >>>>> np.array(np.array([1, 2, 3]), dtype=str) >> array(['1', '2', '3'], >> ? ? ?dtype='|S4') >>>>> np.array([8]).view(dtype='S4') >> array(['\x08'], >> ? ? ?dtype='|S4') >>>>> np.array([8]).view(dtype='S1') >> array(['\x08', '', '', ''], >> ? ? ?dtype='|S1') >> >> But I don't know whether this is a desired feature, numpy might reuse >> the existing buffer (?) > > Yes, I'm on a 64-bit machine. > > That's what I thought so I tried this: > >>> a = np.array([1, 2, 3]) >>> type(a[0]) > ? >>> np.array([a[0], a[1], a[2]], dtype=str) > array(['1', '2', '3'], > ? ? ?dtype='|S1') > > But it gives '|S1' too. I guess I'm lost. for sure it doesn't look very consistent, special treatment of 0-dim ? >>> np.array(a[0], dtype=str) array('1', dtype='|S1') >>> np.array(a[:1], dtype=str) array(['1'], dtype='|S4') >>> a[:1].shape (1,) >>> a[0].shape () Josef > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From pav at iki.fi Thu May 20 18:28:42 2010 From: pav at iki.fi (Pauli Virtanen) Date: Thu, 20 May 2010 22:28:42 +0000 (UTC) Subject: [Numpy-discussion] dtype and array creation References: Message-ID: Thu, 20 May 2010 13:04:11 -0700, Keith Goodman wrote: > Why do the follow expressions give different dtype? > >>> np.array([1, 2, 3], dtype=str) > array(['1', '2', '3'], > dtype='|S1') >>> np.array(np.array([1, 2, 3]), dtype=str) > array(['1', '2', '3'], > dtype='|S8') Scalars seem to be handled specially. Anyway, automatic determination of the string size is a bit dangerous to rely on with non-strings in the array: >>> np.array([np.array(12345)], dtype=str) array(['1234'], dtype='|S4') When I looked at this the last time, it wasn't completely obvious how to make this to do something more sensible. -- Pauli Virtanen From steven.nien at gmail.com Thu May 20 19:43:33 2010 From: steven.nien at gmail.com (Steven Nien) Date: Fri, 21 May 2010 07:43:33 +0800 Subject: [Numpy-discussion] Problem in swig In-Reply-To: References: Message-ID: Hi Thanks for all your help! I found the "strange" result was caused by my algorithm(CUDA). 
The swig part is very ok:) On Thu, May 20, 2010 at 11:18 PM, Bill Spotz wrote: > I tried the following: > > %module example > %{ > #define SWIG_FILE_WITH_INIT > //#include "example.h" > %} > %include "numpy.i" > > %init %{ > import_array(); > %} > > %apply (float* INPLACE_ARRAY1, int DIM1) {(float *a, int na), (float > *b, int nb)}; > > %inline > { > void update_incident_field(int k, float *a, int na, float *b, int nb) > { > for (i=0; i { > a[i] = a[i] + b[i] * k; > } > } > } > > $ swig -python example.i > > and the resulting example_wrap.c file looks OK for me. What strange > output did you get? > > On May 19, 2010, at 7:43 PM, Steven Nien wrote: > > > I want to pass 1 integer and 2 numpy array into the C function as > > followings: > > > > void update_incident_field(int k, float *a, int na, float *b, int > > nb) { > > for (i=0; i > a[i] = a[i] + b[i] * k; > > } > > } > > > > But I don't know how to write the interface code (***.i) > > Can someone help me? > > > > Thanks! > > > > The swig interface code I written(did't work, strange output) > > > > %module example > > %{ > > #define SWIG_FILE_WITH_INIT > > #include "example.h" > > %} > > %include "numpy.i" > > > > %init %{ > > import_array(); > > %} > > > > %apply (float* INPLACE_ARRAY1, int DIM1) {(float *a, int na), (float > > *b, int nb)}; > > > > %include "example.h" > > > > > > > > > > ** Bill Spotz ** > ** Sandia National Laboratories Voice: (505)845-0170 ** > ** P.O. Box 5800 Fax: (505)284-0154 ** > ** Albuquerque, NM 87185-0370 Email: wfspotz at sandia.gov ** > > > > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From kwgoodman at gmail.com Thu May 20 21:00:39 2010 From: kwgoodman at gmail.com (Keith Goodman) Date: Thu, 20 May 2010 18:00:39 -0700 Subject: [Numpy-discussion] astype None Message-ID: While automating some unit tests for my labeled array class, larry, I assumed that np.array([1, 2], dtype=dtype) would give the same result as np.array([1, 2]).astype(dtype) But it doesn't for dtype=None: >> np.array([1, 2, 3], dtype=None) array([1, 2, 3]) >> np.array([1, 2, 3]).astype(None) array([ 1., 2., 3.]) I prefer the behavior of array where dtype=None is a no-op. From kwgoodman at gmail.com Thu May 20 22:21:21 2010 From: kwgoodman at gmail.com (Keith Goodman) Date: Thu, 20 May 2010 19:21:21 -0700 Subject: [Numpy-discussion] Numpy doc string license Message-ID: I'd like to include modified numpy doc strings in my package. Do I just put a note in my license file that says my package contains numpy doc strings and then paste in the numpy license? My package is distributed under a Simplifed BSD license, if that matters. From josef.pktd at gmail.com Thu May 20 22:36:04 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Thu, 20 May 2010 22:36:04 -0400 Subject: [Numpy-discussion] astype None In-Reply-To: References: Message-ID: On Thu, May 20, 2010 at 9:00 PM, Keith Goodman wrote: > While automating some unit tests for my labeled array class, larry, I > assumed that > > np.array([1, 2], dtype=dtype) > > would give the same result as > > np.array([1, 2]).astype(dtype) > > But it doesn't for dtype=None: > >>> np.array([1, 2, 3], dtype=None) > ? array([1, 2, 3]) >>> np.array([1, 2, 3]).astype(None) > ? array([ 1., ?2., ?3.]) > > I prefer the behavior of array where dtype=None is a no-op. 
Since nobody who knows this answered, I try my explanation It's all in the docs astype(None) cast to a specified type here the dtype is "None" None is by default float_ >>> np.dtype(None) dtype('float64') np.array([1, 2, 3], dtype=None) np.asarray([1, 2, 3], dtype=None) here dtype is a keyword argument where None is not a dtype but triggers the default, which is: dtype : data-type, optional By default, the data-type is inferred from the input data. Shall we start a list of inconsistent looking corner cases ?) Josef > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From robert.kern at gmail.com Thu May 20 22:38:09 2010 From: robert.kern at gmail.com (Robert Kern) Date: Thu, 20 May 2010 21:38:09 -0500 Subject: [Numpy-discussion] Numpy doc string license In-Reply-To: References: Message-ID: On Thu, May 20, 2010 at 21:21, Keith Goodman wrote: > I'd like to include modified numpy doc strings in my package. Do I > just put a note in my license file that says my package contains numpy > doc strings and then paste in the numpy license? My package is > distributed under a Simplifed BSD license, if that matters. That should work just fine. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From kwgoodman at gmail.com Thu May 20 22:52:20 2010 From: kwgoodman at gmail.com (Keith Goodman) Date: Thu, 20 May 2010 19:52:20 -0700 Subject: [Numpy-discussion] astype None In-Reply-To: References: Message-ID: On Thu, May 20, 2010 at 7:36 PM, wrote: > On Thu, May 20, 2010 at 9:00 PM, Keith Goodman wrote: >> While automating some unit tests for my labeled array class, larry, I >> assumed that >> >> np.array([1, 2], dtype=dtype) >> >> would give the same result as >> >> np.array([1, 2]).astype(dtype) >> >> But it doesn't for dtype=None: >> >>>> np.array([1, 2, 3], dtype=None) >> ? array([1, 2, 3]) >>>> np.array([1, 2, 3]).astype(None) >> ? array([ 1., ?2., ?3.]) >> >> I prefer the behavior of array where dtype=None is a no-op. > > Since nobody who knows this answered, I try my explanation > > It's all in the docs > > astype(None) ?cast to a specified type > > here the dtype is "None" > None is by default float_ >>>> np.dtype(None) > dtype('float64') > > np.array([1, 2, 3], dtype=None) > np.asarray([1, 2, 3], dtype=None) > > here dtype is a keyword argument where None is not a dtype but > triggers the default, which is: > dtype : data-type, optional > By default, the data-type is inferred from the input data. > > Shall we start a list of inconsistent looking corner cases ?) It's easy to find this sort of stuff with short nose tests. 
Here's a quick hacked example: import numpy as np from numpy.testing import assert_equal def test_astype_dtype(): "array.astype test" dtypes = [float, int, str, bool, complex, object, None] seqs = ([0, 1], [1.0, 2.0]) msg1 = 'arrays failed on dtype %s and sequence %s' msg2 = 'dtype are different when dtype=%s and seq=%s' for dtype in dtypes: for seq in seqs: ar1 = np.array(list(seq), dtype=dtype) # array does dtype ar2 = np.array(list(seq)).astype(dtype) # astype does dtype yield assert_equal, ar1, ar2, msg1 % (dtype, seq) yield assert_equal, ar1.dtype, ar2.dtype, msg2 % (dtype, seq) The output is =================== FAIL: array.astype test -------------------------------------------- <...> AssertionError: Items are not equal: dtype are different when dtype=None and seq=[0, 1] ACTUAL: dtype('int64') DESIRED: dtype('float64') From kwgoodman at gmail.com Thu May 20 23:24:02 2010 From: kwgoodman at gmail.com (Keith Goodman) Date: Thu, 20 May 2010 20:24:02 -0700 Subject: [Numpy-discussion] Numpy doc string license In-Reply-To: References: Message-ID: On Thu, May 20, 2010 at 7:38 PM, Robert Kern wrote: > On Thu, May 20, 2010 at 21:21, Keith Goodman wrote: >> I'd like to include modified numpy doc strings in my package. Do I >> just put a note in my license file that says my package contains numpy >> doc strings and then paste in the numpy license? My package is >> distributed under a Simplifed BSD license, if that matters. > > That should work just fine. Thanks, Robert. BTW, time to bump the NumPy license date to 2010: Copyright (c) 2005-2009, NumPy Developers. From tjhnson at gmail.com Fri May 21 11:09:55 2010 From: tjhnson at gmail.com (T J) Date: Fri, 21 May 2010 08:09:55 -0700 Subject: [Numpy-discussion] Build failure at rev8246 Message-ID: Hi, I tried upgrading today and had trouble building numpy (after rm -rf build). My full build log is here: http://www.filedump.net/dumped/build1274454454.txt If someone can point me in the right direction, I'd appreciate it very much. To excerpts from the log file: Running from numpy source directory.numpy/core/setup_common.py:86: MismatchCAPIWarning: API mismatch detected, the C API version numbers have to be updated. Current C api version is 4, with checksum 59750b518272c8987f02d66445afd3f1, but recorded checksum for C API version 4 in codegen_dir/cversions.txt is 3d8940bf7b0d2a4e25be4338c14c3c85. If functions were added in the C API, you have to update C_API_VERSION in numpy/core/setup_common.pyc. MismatchCAPIWarning) In file included from numpy/core/src/multiarray/multiarraymodule_onefile.c:36: numpy/core/src/multiarray/buffer.c: At top level: numpy/core/src/multiarray/buffer.c:715: error: conflicting types for ?_descriptor_from_pep3118_format? numpy/core/src/multiarray/common.c:221: note: previous implicit declaration of ?_descriptor_from_pep3118_format? was here numpy/core/src/multiarray/buffer.c: In function ?_descriptor_from_pep3118_format?: numpy/core/src/multiarray/buffer.c:751: warning: assignment from incompatible pointer type In file included from numpy/core/src/multiarray/multiarraymodule_onefile.c:45: numpy/core/src/multiarray/multiarraymodule.c: In function ?initmultiarray?: numpy/core/src/multiarray/multiarraymodule.c:3062: warning: implicit declaration of function ?_numpymemoryview_init? 
error: Command "gcc -pthread -fno-strict-aliasing -DNDEBUG -g -fwrapv -O2 -Wall -Wstrict-prototypes -fPIC -Inumpy/core/include -Ibuild/src.linux-i686-2.6/numpy/core/include/numpy -Inumpy/core/src/private -Inumpy/core/src -Inumpy/core -Inumpy/core/src/npymath -Inumpy/core/src/multiarray -Inumpy/core/src/umath -Inumpy/core/include -I/usr/include/python2.6 -Ibuild/src.linux-i686-2.6/numpy/core/src/multiarray -Ibuild/src.linux-i686-2.6/numpy/core/src/umath -c numpy/core/src/multiarray/multiarraymodule_onefile.c -o build/temp.linux-i686-2.6/numpy/core/src/multiarray/multiarraymodule_onefile.o" failed with exit status 1 Hope this helps. From pav at iki.fi Fri May 21 11:51:48 2010 From: pav at iki.fi (Pauli Virtanen) Date: Fri, 21 May 2010 15:51:48 +0000 (UTC) Subject: [Numpy-discussion] Build failure at rev8246 References: Message-ID: Fri, 21 May 2010 08:09:55 -0700, T J wrote: > I tried upgrading today and had trouble building numpy (after rm -rf > build). My full build log is here: > > http://www.filedump.net/dumped/build1274454454.txt > Your SVN checkout might be corrupted, containing a mix of old and new files. Try building from a clean checkout. -- Pauli Virtanen From tjhnson at gmail.com Fri May 21 12:06:41 2010 From: tjhnson at gmail.com (T J) Date: Fri, 21 May 2010 09:06:41 -0700 Subject: [Numpy-discussion] Build failure at rev8246 In-Reply-To: References: Message-ID: On Fri, May 21, 2010 at 8:51 AM, Pauli Virtanen wrote: > Fri, 21 May 2010 08:09:55 -0700, T J wrote: >> I tried upgrading today and had trouble building numpy (after rm -rf >> build). ?My full build log is here: >> >> ? ? http://www.filedump.net/dumped/build1274454454.txt >> > > Your SVN checkout might be corrupted, containing a mix of old and new > files. Try building from a clean checkout. > That was it! When I did an svn update, I noticed conflicts all over the place and accepted with "tf" but I guess that was not enough. Given that I haven't really touched this directory, what was the likely cause of this corruption? The last time I "svn update"'d was a few months ago. From matt.fearon at agsemail.com Fri May 21 14:55:15 2010 From: matt.fearon at agsemail.com (Matt Fearon) Date: Fri, 21 May 2010 14:55:15 -0400 Subject: [Numpy-discussion] calling C function from Python via f2py Message-ID: Hello, I am trying to use f2py to generate a wrapped C function that I can call from Python (passing arguments to and from). I have this almost working, but I receive trouble with "exp and pow" related to C and some "pos (2) error" with one of my passed variables. My f2py syntax is: f2py -c -lm FFMCcalc.pyf FFMCcalc.c Also, my 3 scripts are short and attached. 1. FFMCcalc.c, C function 2. FFMCcalc.pyf, wrapper file 3. test.py, short python code that calls C function Any advice would greatly appreciated to get this working. thanks, Matt -------------- next part -------------- A non-text attachment was scrubbed... Name: FFMCcalc.c Type: text/x-csrc Size: 1666 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: FFMCcalc.pyf Type: application/octet-stream Size: 604 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: test.py Type: text/x-python Size: 138 bytes Desc: not available URL: From matthewturk at gmail.com Fri May 21 16:13:32 2010 From: matthewturk at gmail.com (Matthew Turk) Date: Fri, 21 May 2010 13:13:32 -0700 Subject: [Numpy-discussion] Summation of large float32/float64 arrays Message-ID: Hi all, I have a possibly naive question. I don't really understand this particular set of output: In [1]: import numpy In [2]: a1 = numpy.random.random((512,512,512)).astype("float32") In [3]: a1.sum(axis=0).sum(axis=0).sum(axis=0) Out[3]: 67110312.0 In [4]: a1.sum() Out[4]: 16777216.0 I recognize that the intermediate sums may accumulate error differently than a single call to .sum(), but I guess my concern is that it's accumulating a lot faster than I anticipated. (Interesting to note that a1.sum() returns 0.5*512^3, down to the decimal; is it summing up the mean, which should be ~0.5?) However, with a 256^3 array: In [1]: import numpy In [2]: a1 = numpy.random.random((256,256,256)).astype("float32") In [3]: a1.sum(axis=0).sum(axis=0).sum(axis=0) Out[3]: 8389703.0 In [4]: a1.sum() Out[4]: 8389245.0 The errors are much more reasonable. Is there an overflow or something that occurs with the 512^3? These problems all go completely away with a float64 array, but the issue originally showed up when trying to normalize an on-disk float32 array of size 512^3, where the normalization factor was off by a substantial factor (>2x) depending on the mechanism used to sum. My suspicion is that perhaps I have a naive misconception about intermediate steps in summations, or there is a subtlety I'm missing here. I placed a sample script I used to test this here: http://pastebin.com/dGbHwFPK Thanks for any insight anybody can provide, Matt From robert.kern at gmail.com Fri May 21 16:26:47 2010 From: robert.kern at gmail.com (Robert Kern) Date: Fri, 21 May 2010 15:26:47 -0500 Subject: [Numpy-discussion] Summation of large float32/float64 arrays In-Reply-To: References: Message-ID: On Fri, May 21, 2010 at 15:13, Matthew Turk wrote: > Hi all, > > I have a possibly naive question. ?I don't really understand this > particular set of output: > > In [1]: import numpy > > In [2]: a1 = numpy.random.random((512,512,512)).astype("float32") > > In [3]: a1.sum(axis=0).sum(axis=0).sum(axis=0) > Out[3]: 67110312.0 > > In [4]: a1.sum() > Out[4]: 16777216.0 > > I recognize that the intermediate sums may accumulate error > differently than a single call to .sum(), but I guess my concern is > that it's accumulating a lot faster than I anticipated. ?(Interesting > to note that a1.sum() returns 0.5*512^3, down to the decimal; is it > summing up the mean, which should be ~0.5?) ?However, with a 256^3 > array: > > In [1]: import numpy > > In [2]: a1 = numpy.random.random((256,256,256)).astype("float32") > > In [3]: a1.sum(axis=0).sum(axis=0).sum(axis=0) > Out[3]: 8389703.0 > > In [4]: a1.sum() > Out[4]: 8389245.0 > > The errors are much more reasonable. ?Is there an overflow or > something that occurs with the 512^3? It's not quite an overflow. In [1]: from numpy import * In [2]: x = float32(16777216.0) In [3]: x + float32(0.9) Out[3]: 16777216.0 You are accumulating your result in a float32. With the a.sum() approach, you eventually hit a level where the next number to add is always less than the relative epsilon of float32 precision. So the result doesn't change. And will never change again as long as you only add one number at a time. 
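A small sketch of where that cutoff sits (an added illustration, not part of the original reply; it assumes nothing beyond what np.finfo reports for float32):

import numpy as np

eps = np.finfo(np.float32).eps     # ~1.19e-07, relative spacing of float32
print eps * 2**24                  # 2.0 -- absolute spacing once the total reaches 16777216
x = np.float32(2**24)              # 16777216.0
print x + np.float32(1.0) == x     # True: adding 1.0 no longer changes the total
print np.float64(x) + 1.0          # 16777217.0 -- a float64 accumulator keeps going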
Summing along the other axes creates smaller intermediate sums such that you are usually adding together numbers roughly in the same regime as each other, so you don't lose as much precision. Use a.sum(dtype=np.float64) to use a float64 accumulator. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From matthewturk at gmail.com Fri May 21 16:32:09 2010 From: matthewturk at gmail.com (Matthew Turk) Date: Fri, 21 May 2010 13:32:09 -0700 Subject: [Numpy-discussion] Summation of large float32/float64 arrays In-Reply-To: References: Message-ID: Hi Robert, > It's not quite an overflow. > > In [1]: from numpy import * > > In [2]: x = float32(16777216.0) > > In [3]: x + float32(0.9) > Out[3]: 16777216.0 > > You are accumulating your result in a float32. With the a.sum() > approach, you eventually hit a level where the next number to add is > always less than the relative epsilon of float32 precision. So the > result doesn't change. And will never change again as long as you only > add one number at a time. Summing along the other axes creates smaller > intermediate sums such that you are usually adding together numbers > roughly in the same regime as each other, so you don't lose as much > precision. Thank you very much for that explanation; that completely makes sense. > > Use a.sum(dtype=np.float64) to use a float64 accumulator. > I didn't know about the dtype for accumulators/operators -- that did just the trick. Much obliged, Matt From aisaac at american.edu Fri May 21 17:30:47 2010 From: aisaac at american.edu (Alan G Isaac) Date: Fri, 21 May 2010 17:30:47 -0400 Subject: [Numpy-discussion] Summation of large float32/float64 arrays In-Reply-To: References: Message-ID: <4BF6FB87.5030606@american.edu> On 5/21/2010 4:13 PM, Matthew Turk wrote: > a1 = numpy.random.random((512,512,512)).astype("float32") This consistently gives me a "MemoryError". I believe I have plenty of physical memory. (That should require about 1.3G during creation, right? I have 8G.) It seems I'm hitting some kind of 1G memory use limit.How to think about this? I can create the initial 64 bit array no problem. However I cannot create the second 32 bit array, despite having plenty of physical memory. The copy method also fails; or even creating a second array the same size fails, unless I first delete `a1`. I realize this is probably a naive and operating system dependent question. Apologies if it is off topic. Thanks, Alan Isaac (running 32bit Python 2.6.5 on 64bit Vista; NumPy version import 1.4.1rc2) From cgohlke at uci.edu Fri May 21 20:34:32 2010 From: cgohlke at uci.edu (Christoph Gohlke) Date: Fri, 21 May 2010 17:34:32 -0700 Subject: [Numpy-discussion] Summation of large float32/float64 arrays In-Reply-To: <4BF6FB87.5030606@american.edu> References: <4BF6FB87.5030606@american.edu> Message-ID: <4BF72698.7030606@uci.edu> On 5/21/2010 2:30 PM, Alan G Isaac wrote: > On 5/21/2010 4:13 PM, Matthew Turk wrote: >> a1 = numpy.random.random((512,512,512)).astype("float32") > > > This consistently gives me a "MemoryError". > I believe I have plenty of physical memory. > (That should require about 1.3G during creation, right? > I have 8G.) It seems I'm hitting some kind of 1G > memory use limit.How to think about this? > > I can create the initial 64 bit array no problem. > However I cannot create the second 32 bit array, > despite having plenty of physical memory. 
> The copy method also fails; or even creating a > second array the same size fails, unless I first > delete `a1`. > > I realize this is probably a naive and operating system > dependent question. Apologies if it is off topic. > > Thanks, > Alan Isaac > (running 32bit Python 2.6.5 on 64bit Vista; > NumPy version import 1.4.1rc2) > The MemoryError is not not unexpected. First, the 32-bit Python interpreter can only use 2 GB of your memory. Second, numpy arrays need a contiguous, un-fragmented, range of memory to be available within that 2 GB address space. In this example there is no contiguous 512 MB space left after creating the 1 GB array. Practically it does not seem possible to allocate a single array larger than about 1.3 GB on win32. The solution is to use a 64-bit version of Python and numpy. Christoph From bergstrj at iro.umontreal.ca Fri May 21 21:10:12 2010 From: bergstrj at iro.umontreal.ca (James Bergstra) Date: Fri, 21 May 2010 21:10:12 -0400 Subject: [Numpy-discussion] Bug in frompyfunc starting at 10000 elements? Message-ID: Hi all, I'm wondering if this is a bug... Something strange happens with my ufunc as soon as I use 10000 elements. As the test shows, the ufunc computes the correct result for either the first or last 9999 elements, but both at the same time is no good. Turns out I'm only running numpy 1.3.0 with Python 2.6.4... could someone with a more recent installation maybe check to see if this has been fixed? Thanks, def test_ufunc(): np = numpy rng = np.random.RandomState(2342) a = rng.randn(10000, 2) b = rng.randn(10000, 1) f = lambda x,y:x*y ufunc = np.frompyfunc(lambda *x:numpy.prod(x), 2, 1) def g(x,y): return np.asarray(ufunc(x,y), dtype='float64') assert numpy.allclose(f(a[:-1],b[:-1]), g(a[:-1],b[:-1])) # PASS assert numpy.allclose(f(a[1:],b[1:]), g(a[1:],b[1:])) # PASS assert numpy.allclose(f(a,b), g(a,b)) # FAIL -- http://www-etud.iro.umontreal.ca/~bergstrj -------------- next part -------------- An HTML attachment was scrubbed... URL: From dwf at cs.toronto.edu Fri May 21 21:22:41 2010 From: dwf at cs.toronto.edu (David Warde-Farley) Date: Fri, 21 May 2010 21:22:41 -0400 (EDT) Subject: [Numpy-discussion] Bug in frompyfunc starting at 10000 elements? In-Reply-To: References: Message-ID: Confirmed in NumPy 1.4.1, Py 2.6.5. David On Fri, 21 May 2010, James Bergstra wrote: > Hi all, I'm wondering if this is a bug... > > Something strange happens with my ufunc as soon as I use 10000 elements. As > the test shows, the ufunc computes the correct result for either the first > or last 9999 elements, but both at the same time is no good. > > Turns out I'm only running numpy 1.3.0 with Python 2.6.4... could someone > with a more recent installation maybe check to see if this has been fixed? 
> > Thanks, > > def test_ufunc(): > np = numpy > > rng = np.random.RandomState(2342) > a = rng.randn(10000, 2) > b = rng.randn(10000, 1) > > > f = lambda x,y:x*y > ufunc = np.frompyfunc(lambda *x:numpy.prod(x), 2, 1) > > def g(x,y): > return np.asarray(ufunc(x,y), dtype='float64') > > > assert numpy.allclose(f(a[:-1],b[:-1]), g(a[:-1],b[:-1])) > # PASS > assert numpy.allclose(f(a[1:],b[1:]), g(a[1:],b[1:])) # PASS > assert numpy.allclose(f(a,b), g(a,b)) # FAIL > > > -- > http://www-etud.iro.umontreal.ca/~bergstrj > From nadavh at visionsense.com Sun May 23 01:44:38 2010 From: nadavh at visionsense.com (Nadav Horesh) Date: Sun, 23 May 2010 08:44:38 +0300 Subject: [Numpy-discussion] calling C function from Python via f2py References: Message-ID: <710F2847B0018641891D9A21602763605AD40E@ex3.envision.co.il> in test.py change to print FFMCcalc.FFMCcalc(T,H,W,ro,Fo) As implied from the line print FFMCcalc.FFMCcalc.__doc__ Nadav -----Original Message----- From: numpy-discussion-bounces at scipy.org on behalf of Matt Fearon Sent: Fri 21-May-10 21:55 To: numpy-discussion at scipy.org Subject: [Numpy-discussion] calling C function from Python via f2py Hello, I am trying to use f2py to generate a wrapped C function that I can call from Python (passing arguments to and from). I have this almost working, but I receive trouble with "exp and pow" related to C and some "pos (2) error" with one of my passed variables. My f2py syntax is: f2py -c -lm FFMCcalc.pyf FFMCcalc.c Also, my 3 scripts are short and attached. 1. FFMCcalc.c, C function 2. FFMCcalc.pyf, wrapper file 3. test.py, short python code that calls C function Any advice would greatly appreciated to get this working. thanks, Matt -------------- next part -------------- A non-text attachment was scrubbed... Name: winmail.dat Type: application/ms-tnef Size: 3091 bytes Desc: not available URL: From nadavh at visionsense.com Sun May 23 03:40:34 2010 From: nadavh at visionsense.com (Nadav Horesh) Date: Sun, 23 May 2010 10:40:34 +0300 Subject: [Numpy-discussion] Can not compile numpy with python2.7 onl linux Message-ID: <710F2847B0018641891D9A21602763605AD410@ex3.envision.co.il> I think that line 3405 in _capi.c (svn 8386) should be: #if PY_VERSION_HEX >= 0x03010000 (At least it looks reasonable considering line 3375, and it solves my problem) Nadav From charlesr.harris at gmail.com Sun May 23 13:48:34 2010 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sun, 23 May 2010 11:48:34 -0600 Subject: [Numpy-discussion] Can not compile numpy with python2.7 onl linux In-Reply-To: <710F2847B0018641891D9A21602763605AD410@ex3.envision.co.il> References: <710F2847B0018641891D9A21602763605AD410@ex3.envision.co.il> Message-ID: On Sun, May 23, 2010 at 1:40 AM, Nadav Horesh wrote: > > I think that line 3405 in _capi.c (svn 8386) > should be: > > #if PY_VERSION_HEX >= 0x03010000 > > > (At least it looks reasonable considering line 3375, and it solves my > problem) > > Does the following work? PyCObject is deprecated in 2.7. 
#if PY_VERSION_HEX >= 0x03010000 m = PyModule_Create(&moduledef); #else m = Py_InitModule("_capi", _libnumarrayMethods); #if PY_VERSION_HEX >= 0x02070000 c_api_object = PyCapsule_New((void *)libnumarray_API, NULL, NULL); if (c_api_object == NULL) { PyErr_Clear(); } #else c_api_object = PyCObject_FromVoidPtr((void *)libnumarray_API, NULL); #endif Chuck > Nadav > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From kwgoodman at gmail.com Sun May 23 16:18:48 2010 From: kwgoodman at gmail.com (Keith Goodman) Date: Sun, 23 May 2010 13:18:48 -0700 Subject: [Numpy-discussion] The labeled array package at a glance Message-ID: For those not familiar with the la package and its labeled array, larry, here's a table that gives you a quick overview: http://bazaar.launchpad.net/~kwgoodman/larry/trunk/annotate/head:/doc/source/intro.rst#L120 From aisaac at american.edu Sun May 23 21:04:47 2010 From: aisaac at american.edu (Alan G Isaac) Date: Sun, 23 May 2010 21:04:47 -0400 Subject: [Numpy-discussion] Summation of large float32/float64 arrays In-Reply-To: <4BF72698.7030606@uci.edu> References: <4BF6FB87.5030606@american.edu> <4BF72698.7030606@uci.edu> Message-ID: <4BF9D0AF.4010702@american.edu> On 5/21/2010 8:34 PM, Christoph Gohlke wrote: > the 32-bit Python > interpreter can only use 2 GB of your memory Why? >>> 2**32/1e9 4.2949672960000003 Thanks, Alan From cgohlke at uci.edu Sun May 23 21:33:40 2010 From: cgohlke at uci.edu (Christoph Gohlke) Date: Sun, 23 May 2010 18:33:40 -0700 Subject: [Numpy-discussion] Summation of large float32/float64 arrays In-Reply-To: <4BF9D0AF.4010702@american.edu> References: <4BF6FB87.5030606@american.edu> <4BF72698.7030606@uci.edu> <4BF9D0AF.4010702@american.edu> Message-ID: <4BF9D774.9030007@uci.edu> On 5/23/2010 6:04 PM, Alan G Isaac wrote: > On 5/21/2010 8:34 PM, Christoph Gohlke wrote: >> the 32-bit Python >> interpreter can only use 2 GB of your memory > > Why? > > >>> 2**32/1e9 > 4.2949672960000003 > Because 2 GB is the memory limit for 32-bit processes running in user-mode under 64-bit Windows, unless the executable was specifically built with 'IMAGE_FILE_LARGE_ADDRESS_AWARE' set. See From aisaac at american.edu Sun May 23 22:33:27 2010 From: aisaac at american.edu (Alan G Isaac) Date: Sun, 23 May 2010 22:33:27 -0400 Subject: [Numpy-discussion] Summation of large float32/float64 arrays In-Reply-To: <4BF9D774.9030007@uci.edu> References: <4BF6FB87.5030606@american.edu> <4BF72698.7030606@uci.edu> <4BF9D0AF.4010702@american.edu> <4BF9D774.9030007@uci.edu> Message-ID: <4BF9E577.2020508@american.edu> On 5/23/2010 9:33 PM, Christoph Gohlke wrote: > 2 GB is the memory limit for 32-bit processes running in > user-mode under 64-bit Windows, unless the executable was specifically > built with 'IMAGE_FILE_LARGE_ADDRESS_AWARE' set. > > See Thanks! Alan From amirnntp at gmail.com Mon May 24 03:57:00 2010 From: amirnntp at gmail.com (Amir) Date: Mon, 24 May 2010 00:57:00 -0700 Subject: [Numpy-discussion] building numpy against Cray xt-libsci Message-ID: I am trying to build numpy against Cray's xt-libsci library on a Cray XT5. 
I am getting an error I am hoping for hints on how to resolve: In [1]: import numpy 20 isfinite, size 21 from numpy.lib import triu ---> 22 from numpy.linalg import lapack_lite 23 from numpy.matrixlib.defmatrix import matrix_power 24 ImportError: /opt/xt-libsci/10.4.0/gnu/lib/libsci.so: undefined symbol: fftw_version These are the symbols in libsci: % nm /opt/xt-libsci/10.4.0/gnu/lib/libsci.so | grep fftw_version 00000000010f9a30 B __crafft_internal__crafft_fftw_version_num U fftw_version 00000000005aa8a4 T get_fftw_version I first built numpy with no custom site.cfg file. It built correctly and all tests ran. But it was too slow. Then I tried building numpy against libsci, which has BLAS, LAPACK, FFTW3 among other things. I had to build a libcblas.a from the netlib src as libsci does not have cblas (using gcc, gfortran 4.3.3). Here is my site.cfg, accumulated from several nice tutorials on how to build numpy on these machines, which for some reason don't work for me. The instructions were based on numpy 1.2. [blas] blas_libs = cblas library_dirs = /global/homes/amir/local/lib [lapack] lapack_libs = sci library_dirs = /opt/xt-libsci/10.4.0/gnu/lib [blas_opt] blas_libs = cblas, sci libraries = cblas, sci [lapack_opt] libraries = sci [fftw] libraries = fftw3 Here is what is linked to lapack_lite.so: % ldd ./numpy/linalg/lapack_lite.so libsci.so => /opt/xt-libsci/10.4.0/gnu/lib/libsci.so (0x00002b4493325000) libgfortran.so.3 => /opt/gcc/4.3.3/snos/lib64/libgfortran.so.3 (0x00002b44a4579000) libm.so.6 => /lib64/libm.so.6 (0x00002b44a4770000) libgcc_s.so.1 => /opt/gcc/4.3.3/snos/lib64/libgcc_s.so.1 (0x00002b44a48c6000) libc.so.6 => /lib64/libc.so.6 (0x00002b44a49dd000) /lib64/ld-linux-x86-64.so.2 (0x0000555555554000) Thanks, Amir. -------------- next part -------------- An HTML attachment was scrubbed... URL: From nadavh at visionsense.com Mon May 24 07:23:33 2010 From: nadavh at visionsense.com (Nadav Horesh) Date: Mon, 24 May 2010 14:23:33 +0300 Subject: [Numpy-discussion] Can not compile numpy with python2.7 onllinux References: <710F2847B0018641891D9A21602763605AD410@ex3.envision.co.il> Message-ID: <710F2847B0018641891D9A21602763605AD412@ex3.envision.co.il> That it, you just have to add the missing #endif after m = Py_InitModule ..... Thank you, Nadav. -----Original Message----- From: numpy-discussion-bounces at scipy.org on behalf of Charles R Harris Sent: Sun 23-May-10 20:48 To: Discussion of Numerical Python Subject: Re: [Numpy-discussion] Can not compile numpy with python2.7 onllinux On Sun, May 23, 2010 at 1:40 AM, Nadav Horesh wrote: > > I think that line 3405 in _capi.c (svn 8386) > should be: > > #if PY_VERSION_HEX >= 0x03010000 > > > (At least it looks reasonable considering line 3375, and it solves my > problem) > > Does the following work? PyCObject is deprecated in 2.7. #if PY_VERSION_HEX >= 0x03010000 m = PyModule_Create(&moduledef); #else m = Py_InitModule("_capi", _libnumarrayMethods); #if PY_VERSION_HEX >= 0x02070000 c_api_object = PyCapsule_New((void *)libnumarray_API, NULL, NULL); if (c_api_object == NULL) { PyErr_Clear(); } #else c_api_object = PyCObject_FromVoidPtr((void *)libnumarray_API, NULL); #endif Chuck > Nadav > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- A non-text attachment was scrubbed... 
Name: winmail.dat Type: application/ms-tnef Size: 3397 bytes Desc: not available URL: From matt.fearon at agsemail.com Mon May 24 08:43:25 2010 From: matt.fearon at agsemail.com (Matt Fearon) Date: Mon, 24 May 2010 08:43:25 -0400 Subject: [Numpy-discussion] calling C function from Python via f2py In-Reply-To: <710F2847B0018641891D9A21602763605AD40E@ex3.envision.co.il> References: <710F2847B0018641891D9A21602763605AD40E@ex3.envision.co.il> Message-ID: Nadav, Thank you. I believe it is working now, as the pos(2) error is gone. However, though the error is gone, my return variable from the C function is not being updated as if the C code is not executing? Syntax to the call the C function from Python is the following: FFMCcalc.FFMCcalc(T,H,W,ro,Fo) Should this execute the C code? thanks, Matt On Sun, May 23, 2010 at 1:44 AM, Nadav Horesh wrote: > > in test.py change to > > print FFMCcalc.FFMCcalc(T,H,W,ro,Fo) > > As implied from the line > > print FFMCcalc.FFMCcalc.__doc__ > > ?Nadav > > -----Original Message----- > From: numpy-discussion-bounces at scipy.org on behalf of Matt Fearon > Sent: Fri 21-May-10 21:55 > To: numpy-discussion at scipy.org > Subject: [Numpy-discussion] calling C function from Python via f2py > > Hello, > > I am trying to use f2py to generate a wrapped C function that I can > call from Python (passing arguments to and from). I have this almost > working, but I receive trouble with "exp and pow" related to C and > some "pos (2) error" with one of my passed variables. My f2py syntax > is: > > f2py -c -lm FFMCcalc.pyf FFMCcalc.c > > Also, my 3 scripts are short and attached. > > 1. FFMCcalc.c, C function > 2. FFMCcalc.pyf, wrapper file > 3. test.py, short python code that calls C function > > Any advice would greatly appreciated to get this working. > thanks, > Matt > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > From charlesr.harris at gmail.com Mon May 24 12:11:32 2010 From: charlesr.harris at gmail.com (Charles R Harris) Date: Mon, 24 May 2010 10:11:32 -0600 Subject: [Numpy-discussion] building numpy against Cray xt-libsci In-Reply-To: References: Message-ID: On Mon, May 24, 2010 at 1:57 AM, Amir wrote: > I am trying to build numpy against Cray's xt-libsci library on a Cray XT5. > I am getting an error I am hoping for hints on how to resolve: > > In [1]: import numpy > > 20 isfinite, size > 21 from numpy.lib import triu > ---> 22 from numpy.linalg import lapack_lite > 23 from numpy.matrixlib.defmatrix import matrix_power > 24 > > ImportError: /opt/xt-libsci/10.4.0/gnu/lib/libsci.so: undefined symbol: > fftw_version > > These are the symbols in libsci: > > % nm /opt/xt-libsci/10.4.0/gnu/lib/libsci.so | grep fftw_version > 00000000010f9a30 B __crafft_internal__crafft_fftw_version_num > U fftw_version > 00000000005aa8a4 T get_fftw_version > > > I first built numpy with no custom site.cfg file. It built correctly and > all tests ran. But it was too slow. > > Then I tried building numpy against libsci, which has BLAS, LAPACK, FFTW3 > among other things. I had to build a libcblas.a from the netlib src as > libsci does not have cblas (using gcc, gfortran 4.3.3). Here is my site.cfg, > accumulated from several nice tutorials on how to build numpy on these > machines, which for some reason don't work for me. The instructions were > based on numpy 1.2. 
> > [blas] > blas_libs = cblas > library_dirs = /global/homes/amir/local/lib > > [lapack] > lapack_libs = sci > library_dirs = /opt/xt-libsci/10.4.0/gnu/lib > > [blas_opt] > blas_libs = cblas, sci > libraries = cblas, sci > > [lapack_opt] > libraries = sci > > [fftw] > libraries = fftw3 > > > Here is what is linked to lapack_lite.so: > > % ldd ./numpy/linalg/lapack_lite.so > libsci.so => /opt/xt-libsci/10.4.0/gnu/lib/libsci.so (0x00002b4493325000) > libgfortran.so.3 => /opt/gcc/4.3.3/snos/lib64/libgfortran.so.3 > (0x00002b44a4579000) > libm.so.6 => /lib64/libm.so.6 (0x00002b44a4770000) > libgcc_s.so.1 => /opt/gcc/4.3.3/snos/lib64/libgcc_s.so.1 > (0x00002b44a48c6000) > libc.so.6 => /lib64/libc.so.6 (0x00002b44a49dd000) > /lib64/ld-linux-x86-64.so.2 (0x0000555555554000) > > Curious, fftw shows up in numpy/distutils/system_info.py and f2py, but I think numpy/scipy no longer support fftw. Maybe we should get rid of the references? In any case, you can probably modify numpy/distutils/system_info.py to fix this problem, as it doesn't seem to show up on other systems. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Mon May 24 12:25:25 2010 From: charlesr.harris at gmail.com (Charles R Harris) Date: Mon, 24 May 2010 10:25:25 -0600 Subject: [Numpy-discussion] Tickets for review involving build problems, attn Ralf Gommers Message-ID: Hi Ralf, As one of the official build gurus in training, could you look over the tickets for review that fix build problems on various platforms? TIA, Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Mon May 24 14:01:45 2010 From: charlesr.harris at gmail.com (Charles R Harris) Date: Mon, 24 May 2010 12:01:45 -0600 Subject: [Numpy-discussion] Extending documentation to c code Message-ID: Hi All, I'm wondering if we could extend the current documentation format to the c source code. The string blocks would be implemented something like /**NpyDoc """The Answer. Answer the Ultimate Question of Life, the Universe, and Everything. Parameters ---------- We don't need no stinkin' parameters. Notes ----- The run time of this routine may be excessive. """ */ int answer_ultimate_question(void) { return 42; } and the source scanned to generate the usual documentation. Thoughts? Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From nadavh at visionsense.com Mon May 24 14:37:45 2010 From: nadavh at visionsense.com (Nadav Horesh) Date: Mon, 24 May 2010 21:37:45 +0300 Subject: [Numpy-discussion] calling C function from Python via f2py References: <710F2847B0018641891D9A21602763605AD40E@ex3.envision.co.il> Message-ID: <710F2847B0018641891D9A21602763605AD417@ex3.envision.co.il> Sorry, can not figure it out, if you don't gen an answer on this list maybe you should address it on swig list. Personally I use cython for this purpose. Nadav -----Original Message----- From: numpy-discussion-bounces at scipy.org on behalf of Matt Fearon Sent: Mon 24-May-10 15:43 To: Discussion of Numerical Python Subject: Re: [Numpy-discussion] calling C function from Python via f2py Nadav, Thank you. I believe it is working now, as the pos(2) error is gone. However, though the error is gone, my return variable from the C function is not being updated as if the C code is not executing? Syntax to the call the C function from Python is the following: FFMCcalc.FFMCcalc(T,H,W,ro,Fo) Should this execute the C code? 
thanks, Matt On Sun, May 23, 2010 at 1:44 AM, Nadav Horesh wrote: > > in test.py change to > > print FFMCcalc.FFMCcalc(T,H,W,ro,Fo) > > As implied from the line > > print FFMCcalc.FFMCcalc.__doc__ > > ?Nadav > > -----Original Message----- > From: numpy-discussion-bounces at scipy.org on behalf of Matt Fearon > Sent: Fri 21-May-10 21:55 > To: numpy-discussion at scipy.org > Subject: [Numpy-discussion] calling C function from Python via f2py > > Hello, > > I am trying to use f2py to generate a wrapped C function that I can > call from Python (passing arguments to and from). I have this almost > working, but I receive trouble with "exp and pow" related to C and > some "pos (2) error" with one of my passed variables. My f2py syntax > is: > > f2py -c -lm FFMCcalc.pyf FFMCcalc.c > > Also, my 3 scripts are short and attached. > > 1. FFMCcalc.c, C function > 2. FFMCcalc.pyf, wrapper file > 3. test.py, short python code that calls C function > > Any advice would greatly appreciated to get this working. > thanks, > Matt > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > _______________________________________________ NumPy-Discussion mailing list NumPy-Discussion at scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion -------------- next part -------------- A non-text attachment was scrubbed... Name: winmail.dat Type: application/ms-tnef Size: 3803 bytes Desc: not available URL: From d.l.goldsmith at gmail.com Mon May 24 16:11:48 2010 From: d.l.goldsmith at gmail.com (David Goldsmith) Date: Mon, 24 May 2010 13:11:48 -0700 Subject: [Numpy-discussion] Extending documentation to c code In-Reply-To: References: Message-ID: On Mon, May 24, 2010 at 11:01 AM, Charles R Harris < charlesr.harris at gmail.com> wrote: > Hi All, > > I'm wondering if we could extend the current documentation format to the c > source code. The string blocks would be implemented something like > > /**NpyDoc > """The Answer. > > Answer the Ultimate Question of Life, the Universe, and Everything. > > Parameters > ---------- > We don't need no stinkin' parameters. > > Notes > ----- > The run time of this routine may be excessive. > > """ > */ > int > answer_ultimate_question(void) > { > return 42; > } > > and the source scanned to generate the usual documentation. Thoughts? > > Chuck > IMO it would be necessary to make such doc have the same status w.r.t. the Wiki as the Python source; how much tweaking of pydocweb would that require (Pauli is already over-committed in that regard; Joe, Perry, and I are taking steps to try to alleviate this, but nothing is close to materializing yet). I know that as far as Joe and I are concerned, getting pydocweb to support a dual review process is a much higher, longer-standing priority. Also, quoting from the docstring standard: "An optional section for examples...while optional, this section is very strongly encouraged." (Personally, I think this section should be required, not optional, for functions, and methods which require their own docstrings.) But requiring docwriters to supply working (i.e., compilable, linkable, runable) c code examples (which would appear to be necessary because the coders appear to be loath to provide their docstrings with examples) might be asking too much (since we try to keep the doc writing effort open to persons at least comfortable w/ Python, though not necessarily w/ c). 
Unless and until these concerns can be realistically and successfully addressed, I'm a strong "-1". DG -------------- next part -------------- An HTML attachment was scrubbed... URL: From liukis at usc.edu Mon May 24 18:14:03 2010 From: liukis at usc.edu (Maria Liukis) Date: Mon, 24 May 2010 15:14:03 -0700 Subject: [Numpy-discussion] loadtxt raises an exception on empty file Message-ID: <2E3280C7-6F3F-4844-AFC0-CBAB2B9EEF02@usc.edu> Hello everybody, I'm using numpy V1.3.0 and ran into a case when numpy.loadtxt('foo.txt') raised an exception: >>>import numpy as np >>>np.loadtxt('foo.txt') Traceback (most recent call last): File "", line 1, in File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/lib/io.py", line 456, in loadtxt raise IOError('End-of-file reached before encountering data.') IOError: End-of-file reached before encountering data. >>> if provided file 'foo.txt' is empty. Would anybody happen to know if it's a feature or a bug? I would expect it to return an empty array. numpy.fromfile() handles empty text files: >>> np.fromfile('foo.txt', sep='\t\n ') array([], dtype=float64) >>> Would anybody suggest a graceful way of handling empty files with numpy.loadtxt() (except for catching an IOError exception)? Many thanks, Masha -------------------- liukis at usc.edu From gael.varoquaux at normalesup.org Mon May 24 18:25:37 2010 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Tue, 25 May 2010 00:25:37 +0200 Subject: [Numpy-discussion] [Patch] Fix memmap pickling Message-ID: <20100524222537.GB32540@phare.normalesup.org> Memmapped arrays don't pickle right. I know that to get them to really pickle and restore identically, we would need some effort. However, in the current status, pickling and restoring a memmapped array leads to tracebacks that seem like they could be avoided. I am attaching a patch with a test that shows the problem, and a fix. Should I create a ticket, or is this light-enough to be applied immediatly? Cheers, Ga?l -------------- next part -------------- A non-text attachment was scrubbed... Name: tmp.diff Type: text/x-diff Size: 2241 bytes Desc: not available URL: From bpederse at gmail.com Mon May 24 18:33:09 2010 From: bpederse at gmail.com (Brent Pedersen) Date: Mon, 24 May 2010 15:33:09 -0700 Subject: [Numpy-discussion] [Patch] Fix memmap pickling In-Reply-To: <20100524222537.GB32540@phare.normalesup.org> References: <20100524222537.GB32540@phare.normalesup.org> Message-ID: On Mon, May 24, 2010 at 3:25 PM, Gael Varoquaux wrote: > Memmapped arrays don't pickle right. I know that to get them to > really pickle and restore identically, we would need some effort. > However, in the current status, pickling and restoring a memmapped array > leads to tracebacks that seem like they could be avoided. > > I am attaching a patch with a test that shows the problem, and a fix. > Should I create a ticket, or is this light-enough to be applied > immediatly? > > Cheers, > > Ga?l > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > also check this: http://projects.scipy.org/numpy/ticket/1452 still needs work. 
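For anyone who wants to reproduce the round trip being discussed, a minimal sketch (an editorial illustration: the scratch file name is made up, and this is not the attached patch or the ticket's test):

import pickle
import numpy as np

fname = 'memmap_pickle_demo.dat'     # hypothetical scratch file
m = np.memmap(fname, dtype='float64', mode='w+', shape=(4,))
m[:] = np.arange(4)

s = pickle.dumps(m)                  # pickling the memmapped array
restored = pickle.loads(s)           # may raise a traceback on the affected versions
print type(restored), restored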
From gael.varoquaux at normalesup.org Mon May 24 18:37:50 2010 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Tue, 25 May 2010 00:37:50 +0200 Subject: [Numpy-discussion] [Patch] Fix memmap pickling In-Reply-To: References: <20100524222537.GB32540@phare.normalesup.org> Message-ID: <20100524223750.GC32540@phare.normalesup.org> On Mon, May 24, 2010 at 03:33:09PM -0700, Brent Pedersen wrote: > On Mon, May 24, 2010 at 3:25 PM, Gael Varoquaux > wrote: > > Memmapped arrays don't pickle right. I know that to get them to > > really pickle and restore identically, we would need some effort. > > However, in the current status, pickling and restoring a memmapped array > > leads to tracebacks that seem like they could be avoided. > > I am attaching a patch with a test that shows the problem, and a fix. > > Should I create a ticket, or is this light-enough to be applied > > immediatly? > also check this: > http://projects.scipy.org/numpy/ticket/1452 > still needs work. Does look good. Is there an ETA for your patch to be applied? Right now this bug is making code crash when memmapped arrays are used (eg multiprocessing), so a hot fix can be useful, without removing any merit to your work that addresses the underlying problem. Cheers, Ga?l From charlesr.harris at gmail.com Mon May 24 19:59:41 2010 From: charlesr.harris at gmail.com (Charles R Harris) Date: Mon, 24 May 2010 17:59:41 -0600 Subject: [Numpy-discussion] Extending documentation to c code In-Reply-To: References: Message-ID: On Mon, May 24, 2010 at 2:11 PM, David Goldsmith wrote: > On Mon, May 24, 2010 at 11:01 AM, Charles R Harris < > charlesr.harris at gmail.com> wrote: > >> Hi All, >> >> I'm wondering if we could extend the current documentation format to the c >> source code. The string blocks would be implemented something like >> >> /**NpyDoc >> """The Answer. >> >> Answer the Ultimate Question of Life, the Universe, and Everything. >> >> Parameters >> ---------- >> We don't need no stinkin' parameters. >> >> Notes >> ----- >> The run time of this routine may be excessive. >> >> """ >> */ >> int >> answer_ultimate_question(void) >> { >> return 42; >> } >> >> and the source scanned to generate the usual documentation. Thoughts? >> >> Chuck >> > > IMO it would be necessary to make such doc have the same status w.r.t. the > Wiki as the Python source; how much tweaking of pydocweb would that require > (Pauli is already over-committed in that regard; Joe, Perry, and I are > taking steps to try to alleviate this, but nothing is close to materializing > yet). I know that as far as Joe and I are concerned, getting pydocweb to > support a dual review process is a much higher, longer-standing priority. > > Also, quoting from the docstring standard: "An optional section for > examples...while optional, this section is very strongly encouraged." > (Personally, I think this section should be required, not optional, for > functions, and methods which require their own docstrings.) But requiring > docwriters to supply working (i.e., compilable, linkable, runable) c code > examples (which would appear to be necessary because the coders appear to be > loath to provide their docstrings with examples) might be asking too much > (since we try to keep the doc writing effort open to persons at least > comfortable w/ Python, though not necessarily w/ c). > > Unless and until these concerns can be realistically and successfully > addressed, I'm a strong "-1". 
> > I'm not interested in having this part of the standard user documentation since the c functions are mostly invisible to the user. What I want is documentation for maintainers/developers of the c code. The c code is essentially undocumented and that makes it difficult to work with, especially for new people. At one time in the past I suggested using doxygen but that didn't seem to arouse much interest. I've also tried generating a call graph but only managed to crash the system... Anyway, it needs to be done at some point and I'm looking for suggestions. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From d.l.goldsmith at gmail.com Mon May 24 20:17:37 2010 From: d.l.goldsmith at gmail.com (David Goldsmith) Date: Mon, 24 May 2010 17:17:37 -0700 Subject: [Numpy-discussion] Extending documentation to c code In-Reply-To: References: Message-ID: On Mon, May 24, 2010 at 4:59 PM, Charles R Harris wrote: > > > On Mon, May 24, 2010 at 2:11 PM, David Goldsmith wrote: > >> On Mon, May 24, 2010 at 11:01 AM, Charles R Harris < >> charlesr.harris at gmail.com> wrote: >> >>> Hi All, >>> >>> I'm wondering if we could extend the current documentation format to the >>> c source code. The string blocks would be implemented something like >>> >>> /**NpyDoc >>> """The Answer. >>> >>> Answer the Ultimate Question of Life, the Universe, and Everything. >>> >>> Parameters >>> ---------- >>> We don't need no stinkin' parameters. >>> >>> Notes >>> ----- >>> The run time of this routine may be excessive. >>> >>> """ >>> */ >>> int >>> answer_ultimate_question(void) >>> { >>> return 42; >>> } >>> >>> and the source scanned to generate the usual documentation. Thoughts? >>> >>> Chuck >>> >> >> IMO it would be necessary to make such doc have the same status w.r.t. the >> Wiki as the Python source; how much tweaking of pydocweb would that require >> (Pauli is already over-committed in that regard; Joe, Perry, and I are >> taking steps to try to alleviate this, but nothing is close to materializing >> yet). I know that as far as Joe and I are concerned, getting pydocweb to >> support a dual review process is a much higher, longer-standing priority. >> >> Also, quoting from the docstring standard: "An optional section for >> examples...while optional, this section is very strongly encouraged." >> (Personally, I think this section should be required, not optional, for >> functions, and methods which require their own docstrings.) But requiring >> docwriters to supply working (i.e., compilable, linkable, runable) c code >> examples (which would appear to be necessary because the coders appear to be >> loath to provide their docstrings with examples) might be asking too much >> (since we try to keep the doc writing effort open to persons at least >> comfortable w/ Python, though not necessarily w/ c). >> >> Unless and until these concerns can be realistically and successfully >> addressed, I'm a strong "-1". >> >> > I'm not interested in having this part of the standard user documentation > since the c functions are mostly invisible to the user. What I want is > documentation for maintainers/developers of the c code. The c code is > essentially undocumented and that makes it difficult to work with, > especially for new people. At one time in the past I suggested using doxygen > but that didn't seem to arouse much interest. I've also tried generating a > call graph but only managed to crash the system... Anyway, it needs to be > done at some point and I'm looking for suggestions. 
> > Chuck > > Not checking in un-/poorly documented new code would be a good start. DG -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at googlemail.com Mon May 24 20:22:44 2010 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Tue, 25 May 2010 08:22:44 +0800 Subject: [Numpy-discussion] Tickets for review involving build problems, attn Ralf Gommers In-Reply-To: References: Message-ID: On Tue, May 25, 2010 at 12:25 AM, Charles R Harris < charlesr.harris at gmail.com> wrote: > Hi Ralf, > > As one of the official build gurus in training, could you look over the > tickets for review that fix build problems on various platforms? > > Sure, will do that in the next few weeks. Of course I may need the help of an actual guru to make some decisions:) Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Mon May 24 22:06:14 2010 From: charlesr.harris at gmail.com (Charles R Harris) Date: Mon, 24 May 2010 20:06:14 -0600 Subject: [Numpy-discussion] Extending documentation to c code In-Reply-To: References: Message-ID: On Mon, May 24, 2010 at 6:17 PM, David Goldsmith wrote: > On Mon, May 24, 2010 at 4:59 PM, Charles R Harris < > charlesr.harris at gmail.com> wrote: > >> >> >> On Mon, May 24, 2010 at 2:11 PM, David Goldsmith > > wrote: >> >>> On Mon, May 24, 2010 at 11:01 AM, Charles R Harris < >>> charlesr.harris at gmail.com> wrote: >>> >>>> Hi All, >>>> >>>> I'm wondering if we could extend the current documentation format to the >>>> c source code. The string blocks would be implemented something like >>>> >>>> /**NpyDoc >>>> """The Answer. >>>> >>>> Answer the Ultimate Question of Life, the Universe, and Everything. >>>> >>>> Parameters >>>> ---------- >>>> We don't need no stinkin' parameters. >>>> >>>> Notes >>>> ----- >>>> The run time of this routine may be excessive. >>>> >>>> """ >>>> */ >>>> int >>>> answer_ultimate_question(void) >>>> { >>>> return 42; >>>> } >>>> >>>> and the source scanned to generate the usual documentation. Thoughts? >>>> >>>> Chuck >>>> >>> >>> IMO it would be necessary to make such doc have the same status w.r.t. >>> the Wiki as the Python source; how much tweaking of pydocweb would that >>> require (Pauli is already over-committed in that regard; Joe, Perry, and I >>> are taking steps to try to alleviate this, but nothing is close to >>> materializing yet). I know that as far as Joe and I are concerned, getting >>> pydocweb to support a dual review process is a much higher, longer-standing >>> priority. >>> >>> Also, quoting from the docstring standard: "An optional section for >>> examples...while optional, this section is very strongly encouraged." >>> (Personally, I think this section should be required, not optional, for >>> functions, and methods which require their own docstrings.) But requiring >>> docwriters to supply working (i.e., compilable, linkable, runable) c code >>> examples (which would appear to be necessary because the coders appear to be >>> loath to provide their docstrings with examples) might be asking too much >>> (since we try to keep the doc writing effort open to persons at least >>> comfortable w/ Python, though not necessarily w/ c). >>> >>> Unless and until these concerns can be realistically and successfully >>> addressed, I'm a strong "-1". >>> >>> >> I'm not interested in having this part of the standard user documentation >> since the c functions are mostly invisible to the user. 
What I want is >> documentation for maintainers/developers of the c code. The c code is >> essentially undocumented and that makes it difficult to work with, >> especially for new people. At one time in the past I suggested using doxygen >> but that didn't seem to arouse much interest. I've also tried generating a >> call graph but only managed to crash the system... Anyway, it needs to be >> done at some point and I'm looking for suggestions. >> >> Chuck >> >> Not checking in un-/poorly documented new code would be a good start. > > So exactly how should the c code be documented? There is currently no documentation standard for c code. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From cournape at gmail.com Mon May 24 23:06:09 2010 From: cournape at gmail.com (David Cournapeau) Date: Tue, 25 May 2010 12:06:09 +0900 Subject: [Numpy-discussion] Extending documentation to c code In-Reply-To: References: Message-ID: On Tue, May 25, 2010 at 3:01 AM, Charles R Harris wrote: > Hi All, > > I'm wondering if we could extend the current documentation format to the c > source code. The string blocks would be implemented something like > > /**NpyDoc > """The Answer. > > Answer the Ultimate Question of Life, the Universe, and Everything. > > Parameters > ---------- > We don't need no stinkin' parameters. > > Notes > ----- > The run time of this routine may be excessive. > > """ > */ > int > answer_ultimate_question(void) > { > ??? return 42; > } > > and the source scanned to generate the usual documentation. Thoughts? I have thought about this for quite some time, but it is not easy. Docstrings are useful because of cross references, etc... and documentation for compiled code should contain signature extraction. For those reasons, I think a doc tool would need to parse C, which makes the problem that much harder. Last time I looked, synopsis was interesting, but it does not seem to have caught up. Synopsis was interesting because it was modular, scriptable in python, and supported rest as a markup language within C code. OTOH, I hope that clang will change the game here - it gives a modular, robust C (and soon C++) parser, and having a documentation tool written from that is just a question of time I think. Maybe as a first step, something that could extract function signature would be enough, and writing this should not take too much time (Sebastien B wrote something which could be a start, to autogenerate cython code from header:http://bitbucket.org/binet/cylon). David From vincent at vincentdavis.net Tue May 25 00:49:48 2010 From: vincent at vincentdavis.net (Vincent Davis) Date: Mon, 24 May 2010 22:49:48 -0600 Subject: [Numpy-discussion] loadtxt raises an exception on empty file In-Reply-To: <2E3280C7-6F3F-4844-AFC0-CBAB2B9EEF02@usc.edu> References: <2E3280C7-6F3F-4844-AFC0-CBAB2B9EEF02@usc.edu> Message-ID: On Mon, May 24, 2010 at 4:14 PM, Maria Liukis wrote: > Hello everybody, > > I'm using numpy V1.3.0 and ran into a case when numpy.loadtxt('foo.txt') > raised an exception: > > >>>import numpy as np > >>>np.loadtxt('foo.txt') > Traceback (most recent call last): > File "", line 1, in > File > "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/lib/io.py", > line 456, in loadtxt > raise IOError('End-of-file reached before encountering data.') > IOError: End-of-file reached before encountering data. > >>> > > if provided file 'foo.txt' is empty. > > Would anybody happen to know if it's a feature or a bug? 
I would expect it > to return an empty array. > Looking at the source for loadtxt line 591 # Read until we find a line with some values, and use # it to estimate the number of columns, N. first_vals = None while not first_vals: first_line = fh.readline() if first_line == '': # EOF reached raise IOError('End-of-file reached before encountering data.') So it looks like it is not a bug although I am not sure why returning an empty array would not be valid. But then what are you going to do with the empty array? Vincent > > numpy.fromfile() handles empty text files: > > >>> np.fromfile('foo.txt', sep='\t\n ') > array([], dtype=float64) > >>> > > Would anybody suggest a graceful way of handling empty files with > numpy.loadtxt() (except for catching an IOError exception)? > > Many thanks, > Masha > -------------------- > liukis at usc.edu > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > *Vincent Davis 720-301-3003 * vincent at vincentdavis.net my blog | LinkedIn -------------- next part -------------- An HTML attachment was scrubbed... URL: From d.l.goldsmith at gmail.com Tue May 25 01:09:46 2010 From: d.l.goldsmith at gmail.com (David Goldsmith) Date: Mon, 24 May 2010 22:09:46 -0700 Subject: [Numpy-discussion] Extending documentation to c code In-Reply-To: References: Message-ID: On Mon, May 24, 2010 at 8:06 PM, David Cournapeau wrote: > On Tue, May 25, 2010 at 3:01 AM, Charles R Harris > wrote: > > Hi All, > > > > I'm wondering if we could extend the current documentation format to the > c > > source code. The string blocks would be implemented something like > > > > /**NpyDoc > > """The Answer. > > > > Answer the Ultimate Question of Life, the Universe, and Everything. > > > > Parameters > > ---------- > > We don't need no stinkin' parameters. > > > > Notes > > ----- > > The run time of this routine may be excessive. > > > > """ > > */ > > int > > answer_ultimate_question(void) > > { > > return 42; > > } > > > > and the source scanned to generate the usual documentation. Thoughts? > > I have thought about this for quite some time, but it is not easy. > Docstrings are useful because of cross references, etc... and > documentation for compiled code should contain signature extraction. > For those reasons, I think a doc tool would need to parse C, which > makes the problem that much harder. > > Last time I looked, synopsis was interesting, but it does not seem to > have caught up. Synopsis was interesting because it was modular, > scriptable in python, and supported rest as a markup language within C > code. OTOH, I hope that clang will change the game here - it gives a > modular, robust C (and soon C++) parser, and having a documentation > tool written from that is just a question of time I think. > > Maybe as a first step, something that could extract function signature > would be enough, and writing this should not take too much time > (Sebastien B wrote something which could be a start, to autogenerate > cython code from header:http://bitbucket.org/binet/cylon). > > David > This does sound promising/a good first step. But it doesn't really answer Charles' question about a standard (which would be useful to have to help guide doc editor design). My proposal is that we start w/ what we have - the standard for our Python code - and figure out what makes sense to keep, add, change, and throw out. 
If we don't yet have an SEP process, perhaps this need could serve as a first test case; obviously, if we already do have an SEP, then we should follow that. DG -------------- next part -------------- An HTML attachment was scrubbed... URL: From nadavh at visionsense.com Tue May 25 01:45:35 2010 From: nadavh at visionsense.com (Nadav Horesh) Date: Tue, 25 May 2010 08:45:35 +0300 Subject: [Numpy-discussion] loadtxt raises an exception on empty file References: <2E3280C7-6F3F-4844-AFC0-CBAB2B9EEF02@usc.edu> Message-ID: <710F2847B0018641891D9A21602763605AD418@ex3.envision.co.il> You can just catch the exception and decide what to do with it: try: data = np.loadtxt('foo.txt') except IOError: data = 0 # Or something similar Nadav -----Original Message----- From: numpy-discussion-bounces at scipy.org on behalf of Maria Liukis Sent: Tue 25-May-10 01:14 To: numpy-discussion at scipy.org Subject: [Numpy-discussion] loadtxt raises an exception on empty file Hello everybody, I'm using numpy V1.3.0 and ran into a case when numpy.loadtxt('foo.txt') raised an exception: >>>import numpy as np >>>np.loadtxt('foo.txt') Traceback (most recent call last): File "", line 1, in File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/lib/io.py", line 456, in loadtxt raise IOError('End-of-file reached before encountering data.') IOError: End-of-file reached before encountering data. >>> if provided file 'foo.txt' is empty. Would anybody happen to know if it's a feature or a bug? I would expect it to return an empty array. numpy.fromfile() handles empty text files: >>> np.fromfile('foo.txt', sep='\t\n ') array([], dtype=float64) >>> Would anybody suggest a graceful way of handling empty files with numpy.loadtxt() (except for catching an IOError exception)? Many thanks, Masha -------------------- liukis at usc.edu _______________________________________________ NumPy-Discussion mailing list NumPy-Discussion at scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion -------------- next part -------------- A non-text attachment was scrubbed... Name: winmail.dat Type: application/ms-tnef Size: 3377 bytes Desc: not available URL: From seb.binet at gmail.com Tue May 25 03:04:41 2010 From: seb.binet at gmail.com (Sebastien Binet) Date: Tue, 25 May 2010 09:04:41 +0200 Subject: [Numpy-discussion] Extending documentation to c code In-Reply-To: References: Message-ID: <1274770889-sup-8976@farnsworth> Excerpts from David Cournapeau's message of 2010-05-25 05:06:09 +0200: > On Tue, May 25, 2010 at 3:01 AM, Charles R Harris [snip] > Maybe as a first step, something that could extract function signature > would be enough, and writing this should not take too much time > (Sebastien B wrote something which could be a start, to autogenerate > cython code from header:http://bitbucket.org/binet/cylon). note that llvm/clang is versatile enough to easily provide indices into the source code, which of course includes the comments... I am actually working on improving the python bindings to clang (which are already quite useful for this thread's topic as they are used for code completing C/C++ code - but are not yet complete enough for providing complete function signatures) cheers, sebastien. -- ######################################### # Dr. 
Sebastien Binet # Laboratoire de l'Accelerateur Lineaire # Universite Paris-Sud XI # Batiment 200 # 91898 Orsay ######################################### From cournape at gmail.com Tue May 25 03:30:58 2010 From: cournape at gmail.com (David Cournapeau) Date: Tue, 25 May 2010 16:30:58 +0900 Subject: [Numpy-discussion] Extending documentation to c code In-Reply-To: References: Message-ID: On Tue, May 25, 2010 at 2:09 PM, David Goldsmith wrote: > > This does sound promising/a good first step.? But it doesn't really answer > Charles' question about a standard (which would be useful to have to help > guide doc editor design). it does - I looked into synopsis because we could use rest, and I don't think anyone wants to go the doxygen route. Just putting rest comments into sources is not useful (since just *extracting* them is non trivial for C/C++). I think the documentation project taught us that being able to build a decent looking document is required for people to actually document things. Also, I may have not been clear, but when I said I thought about it, I meant I have tried it and it did not work after one hour of two of tinkering (then I realized that you need to parse C to do anything useful). David From pav at iki.fi Tue May 25 03:34:44 2010 From: pav at iki.fi (Pauli Virtanen) Date: Tue, 25 May 2010 09:34:44 +0200 Subject: [Numpy-discussion] Extending documentation to c code In-Reply-To: References: Message-ID: <1274772884.2045.5.camel@talisman> Hi, ma, 2010-05-24 kello 12:01 -0600, Charles R Harris kirjoitti: > I'm wondering if we could extend the current documentation format to > the c source code. The string blocks would be implemented something > like [clip] I'd perhaps stick closer to C conventions and use something like /** * Spam foo out of the parrots. * * Parameters * ---------- * a * Amount of spam * b * Number of parrots */ int foo(int a, int b) { } Using some JavaDoc-type syntax might also be possible, although I personally find it rather ugly. Also, parsing C might not be too difficult, as projects aimed doing that in Python exist, for instance http://code.google.com/p/pycparser/ -- Pauli Virtanen From cournape at gmail.com Tue May 25 03:51:36 2010 From: cournape at gmail.com (David Cournapeau) Date: Tue, 25 May 2010 16:51:36 +0900 Subject: [Numpy-discussion] Extending documentation to c code In-Reply-To: <1274770889-sup-8976@farnsworth> References: <1274770889-sup-8976@farnsworth> Message-ID: On Tue, May 25, 2010 at 4:04 PM, Sebastien Binet wrote: > Excerpts from David Cournapeau's message of 2010-05-25 05:06:09 +0200: >> On Tue, May 25, 2010 at 3:01 AM, Charles R Harris > [snip] >> Maybe as a first step, something that could extract function signature >> would be enough, and writing this should not take too much time >> (Sebastien B wrote something which could be a start, to autogenerate >> cython code from header:http://bitbucket.org/binet/cylon). > > note that llvm/clang is versatile enough to easily provide indices into > the source code, which of course includes the comments... ?I am actually > working on improving the python bindings to clang Ah, nice - where do you put your work ? 
It seems that llvm-py does not have recent commits, David From seb.binet at gmail.com Tue May 25 03:56:18 2010 From: seb.binet at gmail.com (Sebastien Binet) Date: Tue, 25 May 2010 09:56:18 +0200 Subject: [Numpy-discussion] Extending documentation to c code In-Reply-To: References: <1274770889-sup-8976@farnsworth> Message-ID: <1274774000-sup-555@farnsworth> Excerpts from David Cournapeau's message of 2010-05-25 09:51:36 +0200: > On Tue, May 25, 2010 at 4:04 PM, Sebastien Binet wrote: > > Excerpts from David Cournapeau's message of 2010-05-25 05:06:09 +0200: > >> On Tue, May 25, 2010 at 3:01 AM, Charles R Harris > > [snip] > >> Maybe as a first step, something that could extract function signature > >> would be enough, and writing this should not take too much time > >> (Sebastien B wrote something which could be a start, to autogenerate > >> cython code from header:http://bitbucket.org/binet/cylon). > > > > note that llvm/clang is versatile enough to easily provide indices into > > the source code, which of course includes the comments... ?I am actually > > working on improving the python bindings to clang > > Ah, nice - where do you put your work ? It seems that llvm-py does not > have recent commits, I am talking about the clang python bindings, not the llvm ones :) I am pushing stuff over there: http://llvm.org/viewvc/llvm-project/cfe/trunk/bindings/python/ FYI, my cylon project is waiting for this last patch to be applied: http://lists.cs.uiuc.edu/pipermail/cfe-dev/2010-May/009091.html then world domination... cheers, sebastien. -- ######################################### # Dr. Sebastien Binet # Laboratoire de l'Accelerateur Lineaire # Universite Paris-Sud XI # Batiment 200 # 91898 Orsay ######################################### From daniele at grinta.net Tue May 25 05:37:50 2010 From: daniele at grinta.net (Daniele Nicolodi) Date: Tue, 25 May 2010 11:37:50 +0200 Subject: [Numpy-discussion] Extending documentation to c code In-Reply-To: <1274770889-sup-8976@farnsworth> References: <1274770889-sup-8976@farnsworth> Message-ID: <4BFB9A6E.2080807@grinta.net> On 25/05/10 09:04, Sebastien Binet wrote: > note that llvm/clang is versatile enough to easily provide indices into > the source code, which of course includes the comments... I am actually > working on improving the python bindings to clang (which are already > quite useful for this thread's topic as they are used for code > completing C/C++ code - but are not yet complete enough for providing > complete function signatures) Have you seen http://code.google.com/p/pycparser/ ? It is a pure python implementation with small or no external dependencies. I was thinking about using it for writing a simple cython interfaces generator. I do not know if it supports extracting comments, but I think it would be simple to extend it this way. Cheers, -- Daniele From seb.binet at gmail.com Tue May 25 05:57:24 2010 From: seb.binet at gmail.com (Sebastien Binet) Date: Tue, 25 May 2010 11:57:24 +0200 Subject: [Numpy-discussion] Extending documentation to c code In-Reply-To: <4BFB9A6E.2080807@grinta.net> References: <1274770889-sup-8976@farnsworth> <4BFB9A6E.2080807@grinta.net> Message-ID: <1274781231-sup-611@farnsworth> Excerpts from Daniele Nicolodi's message of 2010-05-25 11:37:50 +0200: > On 25/05/10 09:04, Sebastien Binet wrote: > > > note that llvm/clang is versatile enough to easily provide indices into > > the source code, which of course includes the comments... 
I am actually > > working on improving the python bindings to clang (which are already > > quite useful for this thread's topic as they are used for code > > completing C/C++ code - but are not yet complete enough for providing > > complete function signatures) > > Have you seen http://code.google.com/p/pycparser/ ? yes, and I have been using it to teach myself PyParsing. the problem is that I need to be able to parse C++, not just C. Parsing (correctly) C++ is a full blown project and I prefer to rely on the work of others :) cheers, sebastien. -- ######################################### # Dr. Sebastien Binet # Laboratoire de l'Accelerateur Lineaire # Universite Paris-Sud XI # Batiment 200 # 91898 Orsay ######################################### From matt.fearon at agsemail.com Tue May 25 08:21:22 2010 From: matt.fearon at agsemail.com (Matt Fearon) Date: Tue, 25 May 2010 08:21:22 -0400 Subject: [Numpy-discussion] calling C function from Python via f2py In-Reply-To: <710F2847B0018641891D9A21602763605AD417@ex3.envision.co.il> References: <710F2847B0018641891D9A21602763605AD40E@ex3.envision.co.il> <710F2847B0018641891D9A21602763605AD417@ex3.envision.co.il> Message-ID: Thanks for your time and assistance, Nadav. I will look into the SWIG list and or cython. On Mon, May 24, 2010 at 2:37 PM, Nadav Horesh wrote: > > Sorry, can not figure it out, if you don't gen an answer on this list maybe you should address it on swig list. Personally I use cython for this purpose. > > ? Nadav > > -----Original Message----- > From: numpy-discussion-bounces at scipy.org on behalf of Matt Fearon > Sent: Mon 24-May-10 15:43 > To: Discussion of Numerical Python > Subject: Re: [Numpy-discussion] calling C function from Python via f2py > > Nadav, > > Thank you. I believe it is working now, as the pos(2) error is gone. > However, though the error is gone, my return variable from the C > function is not being updated as if the C code is not executing? > Syntax to the call the C function from Python is the following: > > FFMCcalc.FFMCcalc(T,H,W,ro,Fo) > > Should this execute the C code? > > thanks, > Matt > > > On Sun, May 23, 2010 at 1:44 AM, Nadav Horesh wrote: >> >> in test.py change to >> >> print FFMCcalc.FFMCcalc(T,H,W,ro,Fo) >> >> As implied from the line >> >> print FFMCcalc.FFMCcalc.__doc__ >> >> ?Nadav >> >> -----Original Message----- >> From: numpy-discussion-bounces at scipy.org on behalf of Matt Fearon >> Sent: Fri 21-May-10 21:55 >> To: numpy-discussion at scipy.org >> Subject: [Numpy-discussion] calling C function from Python via f2py >> >> Hello, >> >> I am trying to use f2py to generate a wrapped C function that I can >> call from Python (passing arguments to and from). I have this almost >> working, but I receive trouble with "exp and pow" related to C and >> some "pos (2) error" with one of my passed variables. My f2py syntax >> is: >> >> f2py -c -lm FFMCcalc.pyf FFMCcalc.c >> >> Also, my 3 scripts are short and attached. >> >> 1. FFMCcalc.c, C function >> 2. FFMCcalc.pyf, wrapper file >> 3. test.py, short python code that calls C function >> >> Any advice would greatly appreciated to get this working. 
>> thanks, >> Matt >> >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > From ben.root at ou.edu Tue May 25 10:18:28 2010 From: ben.root at ou.edu (Benjamin Root) Date: Tue, 25 May 2010 09:18:28 -0500 Subject: [Numpy-discussion] Extending documentation to c code In-Reply-To: References: Message-ID: > it does - I looked into synopsis because we could use rest, and I > don't think anyone wants to go the doxygen route. I am curious as to why doxygen isn't a viable option. While I don't have experience with the other suggestions, I have used doxygen in a few of my personall projects and have been quite happy with it serving as internal documentation. doxygen can still produce bare-bones documentation, call-graphs and cross-references from C/C++ code even if there are no special comments for it to parse. While doxygen may not be perfect, I think it does well enough to produce good documentation for developers to use. Just my 2 cents. Ben Root -------------- next part -------------- An HTML attachment was scrubbed... URL: From kwgoodman at gmail.com Tue May 25 12:54:11 2010 From: kwgoodman at gmail.com (Keith Goodman) Date: Tue, 25 May 2010 09:54:11 -0700 Subject: [Numpy-discussion] __eq__ with str and object Message-ID: I don't understand this: >> a1 = np.array(['a', 'b'], dtype=object) >> a2 = np.array(['a', 'b']) >> >> a1 == a2 array([ True, True], dtype=bool) # Looks good >> a2 == a1 False # Should I have expected this? This works like I expected: >> a1 = np.array([1, 2], dtype=object) >> a2 = np.array([1, 2]) >> >> a1 == a2 array([ True, True], dtype=bool) >> a2 == a1 array([ True, True], dtype=bool) From jh at physics.ucf.edu Tue May 25 14:39:44 2010 From: jh at physics.ucf.edu (Joe Harrington) Date: Tue, 25 May 2010 14:39:44 -0400 Subject: [Numpy-discussion] ensuring docstrings in new code Message-ID: Over on [Numpy-discussion] Extending documentation to c code, David G. gave voice to a frustration he and I share about the status of documentation in the new-code development process. I don't want to paint with a broad brush, yet in recent months there have been a number of checkins, unanimously passed off by the entire core development group/Steering Committee, without a single mention that this code entered SVN with deficient or even completely absent docstrings. I'm not pointing out the offenses simply because I don't think public embarrassment is a good policy, but David has a list that we can share with developers privately. It is really hard to do much without docs on anything you didn't write yourself. When I learned how to program, it was beaten into me to document as I went along, that good projects always did, and that major problems up to and including a complete implosion and abandonment of the software could and sometimes did result if docs were not kept current. The Doc Project is not the writing arm of the developers. So far, we have only worked on the existing software, not anything new, and most of us don't even have SVN access. 
Rather, the Doc Project is cleaning up what others, in their haste, did not do, and it frustrates us to see others still not doing it. What change of procedures would ensure that no code, possibly including bug fixes, enters SVN without a ready-for-review docstring? Perhaps new code could be reviewed according to a checklist that includes "has a ready-for-review docstring". Or perhaps announcements to the lists of pending code checkins should include [CHECKIN] in the subject, to attract the attention of the Doc Project's Editor (i.e., David), so that he can act as an outside cop. Or maybe the SVN can somehow be made to check for docstrings, or to ask for them on checkin. Or maybe something in the build could make such a list and compare it to the last list and complain if there are new routines on it. We've called for a rule of no new code without docs many times. I'm not sure of the best solution, but we need something or things won't change. It's not like these are hard to write. The lack of a ready-for-review docstring is a major bug in itself. --jh-- From oliphant at enthought.com Tue May 25 15:37:25 2010 From: oliphant at enthought.com (Travis Oliphant) Date: Tue, 25 May 2010 14:37:25 -0500 Subject: [Numpy-discussion] Introduction to Scott, Jason, and (possibly) others from Enthought Message-ID: <2D706A49-3574-4E5D-81CD-B89B5E6BD0B5@enthought.com> Hi everyone, There has been some talk about re-factoring NumPy to separate out the Python C-API layer and make NumPy closer to a C-library. I know there are a few different ideas about what this means, and also that people are very busy. I also know there is a NumPy 2.0 release that is in the works. I'm excited to let everyone know that we (at Enthought) have been able to find resources (about 3 man months) to work on this re-factoring project and Scott and Jason (both very experienced C and Python programmers) are actively pursuing it. My hope is that NumPy 2.0 will contain this re-factoring (which should be finished just after SciPy 2010 --- where I'm going to organize a Sprint on NumPy which will include at least date-time improvements and re-factoring work). While we have specific goals for the re-factoring, we want this activity to be fully integrated with the NumPy community and Scott and Jason want to interact with the community as much as feasible as they suggest re-factoring changes (though they both have more experience with phone-conversations to resolve concerns than email chains and so some patience from everybody will be appreciated). Because Jason and Scott are new to this mailing list (but not new to NumPy), I wanted to introduce them so they would feel more comfortable posting questions and people would have some context as to what they were trying to do. Scott and Jason are both very proficient and skilled programmers and I have full confidence in their abilities. That said, we very much want the input of as many people as possible as we pursue the goal of grouping together more tightly the Python C-API interface layer to NumPy. I will be involved in some of the discussions, but am currently on a different project which has tight schedules and so I will only be able to provide limited "mailing-list" visibility. 
Best regards, -Travis From charlesr.harris at gmail.com Tue May 25 15:50:34 2010 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 25 May 2010 13:50:34 -0600 Subject: [Numpy-discussion] Introduction to Scott, Jason, and (possibly) others from Enthought In-Reply-To: <2D706A49-3574-4E5D-81CD-B89B5E6BD0B5@enthought.com> References: <2D706A49-3574-4E5D-81CD-B89B5E6BD0B5@enthought.com> Message-ID: On Tue, May 25, 2010 at 1:37 PM, Travis Oliphant wrote: > > Hi everyone, > > There has been some talk about re-factoring NumPy to separate out the > Python C-API layer and make NumPy closer to a C-library. I know > there are a few different ideas about what this means, and also that > people are very busy. I also know there is a NumPy 2.0 release that > is in the works. > > I'm excited to let everyone know that we (at Enthought) have been able > to find resources (about 3 man months) to work on this re-factoring > project and Scott and Jason (both very experienced C and Python > programmers) are actively pursuing it. My hope is that NumPy 2.0 > will contain this re-factoring (which should be finished just after > SciPy 2010 --- where I'm going to organize a Sprint on NumPy which > will include at least date-time improvements and re-factoring work). > > While we have specific goals for the re-factoring, we want this > activity to be fully integrated with the NumPy community and Scott and > Jason want to interact with the community as much as feasible as they > suggest re-factoring changes (though they both have more experience > with phone-conversations to resolve concerns than email chains and so > some patience from everybody will be appreciated). > > Because Jason and Scott are new to this mailing list (but not new to > NumPy), I wanted to introduce them so they would feel more > comfortable posting questions and people would have some context as to > what they were trying to do. > > Scott and Jason are both very proficient and skilled programmers and I > have full confidence in their abilities. That said, we very much > want the input of as many people as possible as we pursue the goal of > grouping together more tightly the Python C-API interface layer to > NumPy. > > I will be involved in some of the discussions, but am currently on a > different project which has tight schedules and so I will only be able > to provide limited "mailing-list" visibility. > > I think 2.0 would be a bit early for this. Is there any reason it couldn't be done in 2.1? What is the planned policy with regards to the visible interface for extensions? It would also be nice to have a rough idea of how the resulting code would be layered, i.e., what is the design for this re-factoring. Simply having a design would be a major step forward. In any case, I think the primary goal for 2.0 should remain the python3k port. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From mdroe at stsci.edu Tue May 25 16:21:20 2010 From: mdroe at stsci.edu (Michael Droettboom) Date: Tue, 25 May 2010 16:21:20 -0400 Subject: [Numpy-discussion] __eq__ with str and object In-Reply-To: References: Message-ID: <4BFC3140.7090708@stsci.edu> Seems like a bug to me. Certain branches in _array_richcompare return False to fail rather than Py_NotImplemented, which means the string-understanding comparison fallbacks don't run. Attached is a (simple) patch that resolves this bug, and doesn't seem to cause any of the unit tests to fail. 
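In other words, where _array_richcompare decides it cannot handle the operands, the convention is to hand the decision back to Python instead of answering the comparison itself. A minimal sketch of that pattern (just the generic CPython idiom, not the patch itself; the surrounding branch logic in the actual _array_richcompare is more involved):

#include <Python.h>

/* When a rich comparison cannot be performed, return the
 * Py_NotImplemented singleton (with its reference count bumped)
 * so that the other operand's comparison, or the string-aware
 * fallback, gets a chance to run.  Returning Py_False instead
 * wrongly answers "not equal" and short-circuits those fallbacks. */
static PyObject *
comparison_not_supported(void)
{
    Py_INCREF(Py_NotImplemented);
    return Py_NotImplemented;
}

That is essentially what the patch does in the branches that currently return a False result.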
Does this make sense to someone with a better understanding of the rich comparison code than I? Mike On 05/25/2010 12:54 PM, Keith Goodman wrote: >>> a1 = np.array(['a', 'b'], dtype=object) >>> >> a2 = np.array(['a', 'b']) >>> >> >>> >> a1 == a2 >>> > array([ True, True], dtype=bool) # Looks good > >>> >> a2 == a1 >>> > False # Should I have expected this? > > -- Michael Droettboom Science Software Branch Space Telescope Science Institute Baltimore, Maryland, USA -------------- next part -------------- A non-text attachment was scrubbed... Name: char_eq.diff Type: text/x-patch Size: 2049 bytes Desc: not available URL: From oliphant at enthought.com Tue May 25 16:54:26 2010 From: oliphant at enthought.com (Travis Oliphant) Date: Tue, 25 May 2010 15:54:26 -0500 Subject: [Numpy-discussion] Introduction to Scott, Jason, and (possibly) others from Enthought In-Reply-To: References: <2D706A49-3574-4E5D-81CD-B89B5E6BD0B5@enthought.com> Message-ID: <11E425D5-D34A-44FC-A0B1-6D2641D522EA@enthought.com> On May 25, 2010, at 2:50 PM, Charles R Harris wrote: > > > On Tue, May 25, 2010 at 1:37 PM, Travis Oliphant > wrote: > > Hi everyone, > > There has been some talk about re-factoring NumPy to separate out the > Python C-API layer and make NumPy closer to a C-library. I know > there are a few different ideas about what this means, and also that > people are very busy. I also know there is a NumPy 2.0 release that > is in the works. > > I'm excited to let everyone know that we (at Enthought) have been able > to find resources (about 3 man months) to work on this re-factoring > project and Scott and Jason (both very experienced C and Python > programmers) are actively pursuing it. My hope is that NumPy 2.0 > will contain this re-factoring (which should be finished just after > SciPy 2010 --- where I'm going to organize a Sprint on NumPy which > will include at least date-time improvements and re-factoring work). > > While we have specific goals for the re-factoring, we want this > activity to be fully integrated with the NumPy community and Scott and > Jason want to interact with the community as much as feasible as they > suggest re-factoring changes (though they both have more experience > with phone-conversations to resolve concerns than email chains and so > some patience from everybody will be appreciated). > > Because Jason and Scott are new to this mailing list (but not new to > NumPy), I wanted to introduce them so they would feel more > comfortable posting questions and people would have some context as to > what they were trying to do. > > Scott and Jason are both very proficient and skilled programmers and I > have full confidence in their abilities. That said, we very much > want the input of as many people as possible as we pursue the goal of > grouping together more tightly the Python C-API interface layer to > NumPy. > > I will be involved in some of the discussions, but am currently on a > different project which has tight schedules and so I will only be able > to provide limited "mailing-list" visibility. > > > I think 2.0 would be a bit early for this. Is there any reason it > couldn't be done in 2.1? What is the planned policy with regards to > the visible interface for extensions? It would also be nice to have > a rough idea of how the resulting code would be layered, i.e., what > is the design for this re-factoring. Simply having a design would be > a major step forward. The problem with doing it in 2.1 is that this re-factoring will require extensions to be re-built. 
The visible interface to extensions will not change, but there will likely be ABI incompatibility. It seems prudent to do this in NumPy 2.0. Perhaps we can also put in place the ABI-protecting indirection approaches that David C. was suggesting earlier. Some aspects of the design are still being fleshed out, but the basic idea is to separate out a core library that is as independent of the Python C-API as possible. There will likely be at least some dependency on the Python C-API (reference counting and error handling and possibly others) which any interface would have to provide in a very simple Python.h -- equivalent, for example. Our purpose is to allow NumPy to be integrated with other languages or other frameworks systems without explicitly relying on CPython. There are a lot of questions as to how this will work, and so much of that is being worked out. Part of the reason for this mail is to help ensure that as much of this discussion as possible takes place in public. -Travis > > In any case, I think the primary goal for 2.0 should remain the > python3k port. > > Chuck > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion -- Travis Oliphant Enthought Inc. 1-512-536-1057 http://www.enthought.com oliphant at enthought.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Tue May 25 17:19:35 2010 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 25 May 2010 15:19:35 -0600 Subject: [Numpy-discussion] Introduction to Scott, Jason, and (possibly) others from Enthought In-Reply-To: <11E425D5-D34A-44FC-A0B1-6D2641D522EA@enthought.com> References: <2D706A49-3574-4E5D-81CD-B89B5E6BD0B5@enthought.com> <11E425D5-D34A-44FC-A0B1-6D2641D522EA@enthought.com> Message-ID: On Tue, May 25, 2010 at 2:54 PM, Travis Oliphant wrote: > > On May 25, 2010, at 2:50 PM, Charles R Harris wrote: > > > > On Tue, May 25, 2010 at 1:37 PM, Travis Oliphant wrote: > >> >> Hi everyone, >> >> There has been some talk about re-factoring NumPy to separate out the >> Python C-API layer and make NumPy closer to a C-library. I know >> there are a few different ideas about what this means, and also that >> people are very busy. I also know there is a NumPy 2.0 release that >> is in the works. >> >> I'm excited to let everyone know that we (at Enthought) have been able >> to find resources (about 3 man months) to work on this re-factoring >> project and Scott and Jason (both very experienced C and Python >> programmers) are actively pursuing it. My hope is that NumPy 2.0 >> will contain this re-factoring (which should be finished just after >> SciPy 2010 --- where I'm going to organize a Sprint on NumPy which >> will include at least date-time improvements and re-factoring work). >> >> While we have specific goals for the re-factoring, we want this >> activity to be fully integrated with the NumPy community and Scott and >> Jason want to interact with the community as much as feasible as they >> suggest re-factoring changes (though they both have more experience >> with phone-conversations to resolve concerns than email chains and so >> some patience from everybody will be appreciated). >> >> Because Jason and Scott are new to this mailing list (but not new to >> NumPy), I wanted to introduce them so they would feel more >> comfortable posting questions and people would have some context as to >> what they were trying to do. 
>> >> Scott and Jason are both very proficient and skilled programmers and I >> have full confidence in their abilities. That said, we very much >> want the input of as many people as possible as we pursue the goal of >> grouping together more tightly the Python C-API interface layer to >> NumPy. >> >> I will be involved in some of the discussions, but am currently on a >> different project which has tight schedules and so I will only be able >> to provide limited "mailing-list" visibility. >> >> > I think 2.0 would be a bit early for this. Is there any reason it couldn't > be done in 2.1? What is the planned policy with regards to the visible > interface for extensions? It would also be nice to have a rough idea of how > the resulting code would be layered, i.e., what is the design for this > re-factoring. Simply having a design would be a major step forward. > > > The problem with doing it in 2.1 is that this re-factoring will require > extensions to be re-built. The visible interface to extensions will not > change, but there will likely be ABI incompatibility. It seems prudent to > do this in NumPy 2.0. Perhaps we can also put in place the ABI-protecting > indirection approaches that David C. was suggesting earlier. > > Some aspects of the design are still being fleshed out, but the basic idea > is to separate out a core library that is as independent of the Python C-API > as possible. There will likely be at least some dependency on the Python > C-API (reference counting and error handling and possibly others) which any > interface would have to provide in a very simple Python.h -- equivalent, for > example. > > Our purpose is to allow NumPy to be integrated with other languages or > other frameworks systems without explicitly relying on CPython. There are > a lot of questions as to how this will work, and so much of that is being > worked out. Part of the reason for this mail is to help ensure that as > much of this discussion as possible takes place in public. > > Sounds good, but what if it doesn't get finished in a few months? I think we should get 2.0.0 out pronto, ideally it would already have been released. I think a major refactoring like this proposal should get the 3.0.0 label. Admittedly that makes keeping a refactored branch current with fixes going into the trunk a hassle, but perhaps that can be worked around somewhat by clearly labeling what files will be touched in the refactoring and possibly rearranging the content of the existing files. This requires a game plan and a clear idea of the goal. Put simply, I think the proposed schedule is too ambitious and needs to be fleshed out. This refactoring isn't going to be as straight forward as the python3k port because a lot of design decisions need to be made along the way. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From d.l.goldsmith at gmail.com Tue May 25 17:49:33 2010 From: d.l.goldsmith at gmail.com (David Goldsmith) Date: Tue, 25 May 2010 14:49:33 -0700 Subject: [Numpy-discussion] Introduction to Scott, Jason, and (possibly) others from Enthought In-Reply-To: References: <2D706A49-3574-4E5D-81CD-B89B5E6BD0B5@enthought.com> <11E425D5-D34A-44FC-A0B1-6D2641D522EA@enthought.com> Message-ID: Travis: do you already have a place on the NumPy Development Wikiwhere you're (b)logging your design decisions? Seems like a good way for concerned parties to monitor your choices in more or less real time and thus provide comment in a timely fashion. 
DG On Tue, May 25, 2010 at 2:19 PM, Charles R Harris wrote: > > > On Tue, May 25, 2010 at 2:54 PM, Travis Oliphant wrote: > >> >> On May 25, 2010, at 2:50 PM, Charles R Harris wrote: >> >> >> >> On Tue, May 25, 2010 at 1:37 PM, Travis Oliphant wrote: >> >>> >>> Hi everyone, >>> >>> There has been some talk about re-factoring NumPy to separate out the >>> Python C-API layer and make NumPy closer to a C-library. I know >>> there are a few different ideas about what this means, and also that >>> people are very busy. I also know there is a NumPy 2.0 release that >>> is in the works. >>> >>> I'm excited to let everyone know that we (at Enthought) have been able >>> to find resources (about 3 man months) to work on this re-factoring >>> project and Scott and Jason (both very experienced C and Python >>> programmers) are actively pursuing it. My hope is that NumPy 2.0 >>> will contain this re-factoring (which should be finished just after >>> SciPy 2010 --- where I'm going to organize a Sprint on NumPy which >>> will include at least date-time improvements and re-factoring work). >>> >>> While we have specific goals for the re-factoring, we want this >>> activity to be fully integrated with the NumPy community and Scott and >>> Jason want to interact with the community as much as feasible as they >>> suggest re-factoring changes (though they both have more experience >>> with phone-conversations to resolve concerns than email chains and so >>> some patience from everybody will be appreciated). >>> >>> Because Jason and Scott are new to this mailing list (but not new to >>> NumPy), I wanted to introduce them so they would feel more >>> comfortable posting questions and people would have some context as to >>> what they were trying to do. >>> >>> Scott and Jason are both very proficient and skilled programmers and I >>> have full confidence in their abilities. That said, we very much >>> want the input of as many people as possible as we pursue the goal of >>> grouping together more tightly the Python C-API interface layer to >>> NumPy. >>> >>> I will be involved in some of the discussions, but am currently on a >>> different project which has tight schedules and so I will only be able >>> to provide limited "mailing-list" visibility. >>> >>> >> I think 2.0 would be a bit early for this. Is there any reason it couldn't >> be done in 2.1? What is the planned policy with regards to the visible >> interface for extensions? It would also be nice to have a rough idea of how >> the resulting code would be layered, i.e., what is the design for this >> re-factoring. Simply having a design would be a major step forward. >> >> >> The problem with doing it in 2.1 is that this re-factoring will require >> extensions to be re-built. The visible interface to extensions will not >> change, but there will likely be ABI incompatibility. It seems prudent to >> do this in NumPy 2.0. Perhaps we can also put in place the ABI-protecting >> indirection approaches that David C. was suggesting earlier. >> >> Some aspects of the design are still being fleshed out, but the basic idea >> is to separate out a core library that is as independent of the Python C-API >> as possible. There will likely be at least some dependency on the Python >> C-API (reference counting and error handling and possibly others) which any >> interface would have to provide in a very simple Python.h -- equivalent, for >> example. 
>> >> Our purpose is to allow NumPy to be integrated with other languages or >> other frameworks systems without explicitly relying on CPython. There are >> a lot of questions as to how this will work, and so much of that is being >> worked out. Part of the reason for this mail is to help ensure that as >> much of this discussion as possible takes place in public. >> >> > Sounds good, but what if it doesn't get finished in a few months? I think > we should get 2.0.0 out pronto, ideally it would already have been released. > I think a major refactoring like this proposal should get the 3.0.0 label. > Admittedly that makes keeping a refactored branch current with fixes going > into the trunk a hassle, but perhaps that can be worked around somewhat by > clearly labeling what files will be touched in the refactoring and possibly > rearranging the content of the existing files. This requires a game plan and > a clear idea of the goal. Put simply, I think the proposed schedule is too > ambitious and needs to be fleshed out. This refactoring isn't going to be > as straight forward as the python3k port because a lot of design decisions > need to be made along the way. > > Chuck > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -- Mathematician: noun, someone who disavows certainty when their uncertainty set is non-empty, even if that set has measure zero. Hope: noun, that delusive spirit which escaped Pandora's jar and, with her lies, prevents mankind from committing a general suicide. (As interpreted by Robert Graves) -------------- next part -------------- An HTML attachment was scrubbed... URL: From cournape at gmail.com Tue May 25 17:57:27 2010 From: cournape at gmail.com (David Cournapeau) Date: Wed, 26 May 2010 06:57:27 +0900 Subject: [Numpy-discussion] Extending documentation to c code In-Reply-To: References: Message-ID: On Tue, May 25, 2010 at 11:18 PM, Benjamin Root wrote: > >> it does - I looked into synopsis because we could use rest, and I >> don't think anyone wants to go the doxygen route. > > I am curious as to why doxygen isn't a viable option.? While I don't have > experience with the other suggestions, I have used doxygen in a few of my > personall projects and have been quite happy with it serving as internal > documentation. It is yet another format to use inside C sources (I don't think doxygen supports rest), and I would rather have something that is similar, ideally integrated into sphinx. It also generates rather ugly doc by default, David From cournape at gmail.com Tue May 25 18:06:20 2010 From: cournape at gmail.com (David Cournapeau) Date: Wed, 26 May 2010 07:06:20 +0900 Subject: [Numpy-discussion] Introduction to Scott, Jason, and (possibly) others from Enthought In-Reply-To: References: <2D706A49-3574-4E5D-81CD-B89B5E6BD0B5@enthought.com> <11E425D5-D34A-44FC-A0B1-6D2641D522EA@enthought.com> Message-ID: On Wed, May 26, 2010 at 6:19 AM, Charles R Harris wrote: > Sounds good, but what if it doesn't get finished in a few months? I think we > should get 2.0.0 out pronto, ideally it would already have been released. I > think a major refactoring like this proposal should get the 3.0.0 label. Naming it 3.0 or 2.1 does not matter much - I think we should avoid breaking things twice. 
I can see a few solutions: - postpone 2.0 "indefinitely", until this new work is done - backport py3k changes to 1.5 (which would be API and ABI compatible with 1.4.1), and 2.0 would contain all the breaking changes. I am really worried about breaking things once now and once in a few months (or even a year). David From jh at physics.ucf.edu Tue May 25 18:09:44 2010 From: jh at physics.ucf.edu (Joe Harrington) Date: Tue, 25 May 2010 18:09:44 -0400 Subject: [Numpy-discussion] Introduction to Scott, Jason, and (possibly) others from Enthought Message-ID: On Tue, 25 May 2010 15:54:26 -0500, Travis Oliphant wrote: >On May 25, 2010, at 2:50 PM, Charles R Harris wrote: > >> >> >> On Tue, May 25, 2010 at 1:37 PM, Travis Oliphant > > wrote: >> >> Hi everyone, >> >> There has been some talk about re-factoring NumPy to separate out the >> Python C-API layer and make NumPy closer to a C-library. I know >> there are a few different ideas about what this means, and also that >> people are very busy. I also know there is a NumPy 2.0 release that >> is in the works. >> >> I'm excited to let everyone know that we (at Enthought) have been able >> to find resources (about 3 man months) to work on this re-factoring >> project and Scott and Jason (both very experienced C and Python >> programmers) are actively pursuing it. My hope is that NumPy 2.0 >> will contain this re-factoring (which should be finished just after >> SciPy 2010 --- where I'm going to organize a Sprint on NumPy which >> will include at least date-time improvements and re-factoring work). >> >> While we have specific goals for the re-factoring, we want this >> activity to be fully integrated with the NumPy community and Scott and >> Jason want to interact with the community as much as feasible as they >> suggest re-factoring changes (though they both have more experience >> with phone-conversations to resolve concerns than email chains and so >> some patience from everybody will be appreciated). >> >> Because Jason and Scott are new to this mailing list (but not new to >> NumPy), I wanted to introduce them so they would feel more >> comfortable posting questions and people would have some context as to >> what they were trying to do. >> >> Scott and Jason are both very proficient and skilled programmers and I >> have full confidence in their abilities. That said, we very much >> want the input of as many people as possible as we pursue the goal of >> grouping together more tightly the Python C-API interface layer to >> NumPy. >> >> I will be involved in some of the discussions, but am currently on a >> different project which has tight schedules and so I will only be able >> to provide limited "mailing-list" visibility. >> >> >> I think 2.0 would be a bit early for this. Is there any reason it >> couldn't be done in 2.1? What is the planned policy with regards to >> the visible interface for extensions? It would also be nice to have >> a rough idea of how the resulting code would be layered, i.e., what >> is the design for this re-factoring. Simply having a design would be >> a major step forward. > >The problem with doing it in 2.1 is that this re-factoring will >require extensions to be re-built. The visible interface to >extensions will not change, but there will likely be ABI >incompatibility. It seems prudent to do this in NumPy 2.0. >Perhaps we can also put in place the ABI-protecting indirection >approaches that David C. was suggesting earlier. 
> >Some aspects of the design are still being fleshed out, but the basic >idea is to separate out a core library that is as independent of the >Python C-API as possible. There will likely be at least some >dependency on the Python C-API (reference counting and error handling >and possibly others) which any interface would have to provide in a >very simple Python.h -- equivalent, for example. > >Our purpose is to allow NumPy to be integrated with other languages or >other frameworks systems without explicitly relying on CPython. >There are a lot of questions as to how this will work, and so much of >that is being worked out. Part of the reason for this mail is to >help ensure that as much of this discussion as possible takes place in >public. > >-Travis > > >> >> In any case, I think the primary goal for 2.0 should remain the >> python3k port. >> >> Chuck >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion > >-- >Travis Oliphant >Enthought Inc. >1-512-536-1057 >http://www.enthought.com >oliphant at enthought.com > Chuck wants to make 2.0 be the py3k release. I expressed a desire some time ago to hold 2.0 until we had reviewed docs, on the basis that this should have been done for 1.0, and 2.0 is our "real 1.0" for this and other reasons. Now Travis wants to refactor the C code in a way that breaks extensions (another "real 1.0" item). But Chuck wants to get his project out the door soon, and not without justification. NumPy is not a small project anymore and it would make sense not only to have an open community process (which we sort-of do, except when we don't), but to define that process and to conduct more of it in plain view, on the web site, rather than hiding it here in the list archives where only subscribers are likely to see it. There are many interested parties (users) who are not on the lists. It's fine (even good) not to commit to specific dates until we know them for sure, but the process and rough timeline should be laid out publicly. I hope we can agree on one item, which is that the 2.0 release should be one that we are not retroactively embarrassed about. This means it should be well tested and that the docs should be up to date for the new stuff, including examples. Whether the docs need to be reviewed or not is a different question; they should at least not be wrong. It's May 25. SciPy 2010 starts 28 June, just one month out. Chuck and Travis, would you be amenable to making a 2.0 release page on the web site and share there your timelines for making these major changes, including testing and documenting them? It would also be good to see any other planned changes listed out, the 2.0 release cutoff date, and a timeline for the release testing process. Thanks, --jh-- From cournape at gmail.com Tue May 25 18:31:09 2010 From: cournape at gmail.com (David Cournapeau) Date: Wed, 26 May 2010 07:31:09 +0900 Subject: [Numpy-discussion] Introduction to Scott, Jason, and (possibly) others from Enthought In-Reply-To: <2D706A49-3574-4E5D-81CD-B89B5E6BD0B5@enthought.com> References: <2D706A49-3574-4E5D-81CD-B89B5E6BD0B5@enthought.com> Message-ID: On Wed, May 26, 2010 at 4:37 AM, Travis Oliphant wrote: > > Hi everyone, > > There has been some talk about re-factoring NumPy to separate out the > Python C-API layer and make NumPy closer to a C-library. ? I know > there are a few different ideas about what this means, and also that > people are very busy. 
?I also know there is a NumPy 2.0 release that > is in the works. > > I'm excited to let everyone know that we (at Enthought) have been able > to find resources (about 3 man months) to work on this re-factoring > project and Scott and Jason (both very experienced C and Python > programmers) are actively pursuing it. ? ?My hope is that NumPy 2.0 > will contain this re-factoring (which should be finished just after > SciPy 2010 --- where I'm going to organize a Sprint on NumPy which > will include at least date-time improvements and re-factoring work). This sounds great. As for how this is to be done, what would be the numpy aspects to be worked on first ? The obvious candidates are broadcasting, indexing, and ufunc ? Concerning the goal of making numpy available to other languages, is the strategy already decided ? For example, will the core C API be reference-counted though python, or will this be abstracted ? It may be not feasable in the timeframe that Enthought has in mind, but I have been wondering for some time if we could do something like LUA does. LUA has a stronger story than python in terms of embedding within other languages, and the stack may be one solution for abstracting the reference counting (only the API around the stack and its implementation would need to be aware of a particular implementation of memory management). David > > While we have specific goals for the re-factoring, we want this > activity to be fully integrated with the NumPy community and Scott and > Jason want to interact with the community as much as feasible as they > suggest re-factoring changes (though they both have more experience > with phone-conversations to resolve concerns than email chains and so > some patience from everybody will be appreciated). > > Because Jason and Scott are new to this mailing list (but not new to > NumPy), ?I wanted to introduce them so they would feel more > comfortable posting questions and people would have some context as to > what they were trying to do. > > Scott and Jason are both very proficient and skilled programmers and I > have full confidence in their abilities. ? That said, we very much > want the input of as many people as possible as we pursue the goal of > grouping together more tightly the Python C-API interface layer to > NumPy. > > I will be involved in some of the discussions, but am currently on a > different project which has tight schedules and so I will only be able > to provide limited "mailing-list" visibility. > > Best regards, > > -Travis > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From charlesr.harris at gmail.com Tue May 25 18:31:49 2010 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 25 May 2010 16:31:49 -0600 Subject: [Numpy-discussion] Introduction to Scott, Jason, and (possibly) others from Enthought In-Reply-To: References: <2D706A49-3574-4E5D-81CD-B89B5E6BD0B5@enthought.com> <11E425D5-D34A-44FC-A0B1-6D2641D522EA@enthought.com> Message-ID: On Tue, May 25, 2010 at 4:06 PM, David Cournapeau wrote: > On Wed, May 26, 2010 at 6:19 AM, Charles R Harris > wrote: > > > Sounds good, but what if it doesn't get finished in a few months? I think > we > > should get 2.0.0 out pronto, ideally it would already have been released. > I > > think a major refactoring like this proposal should get the 3.0.0 label. 
> > Naming it 3.0 or 2.1 does not matter much - I think we should avoid > breaking things twice. I can see a few solutions: > - postpone 2.0 "indefinitely", until this new work is done > - backport py3k changes to 1.5 (which would be API and ABI > compatible with 1.4.1), and 2.0 would contain all the breaking > changes. > If I had to choose between those, I would pick making a 1.5 release, that is, branch the current trunk and then excise datetime and all the related changes. Let me propose a schedule: - Branch 1.5 in late June. The time until then to be spent closing tickets. - Release 1.5 towards the end of July. That should be doable now that the release folks have had some practice. - Release 2.0 next spring. I don't think 3 man months is enough time to redesign/refactor numpy, get it tested, and document the changes. If we hide stuff away it will be even longer before folks who have written extensions can make the needed changes. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Tue May 25 18:59:29 2010 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 25 May 2010 16:59:29 -0600 Subject: [Numpy-discussion] Introduction to Scott, Jason, and (possibly) others from Enthought In-Reply-To: References: <2D706A49-3574-4E5D-81CD-B89B5E6BD0B5@enthought.com> Message-ID: On Tue, May 25, 2010 at 4:31 PM, David Cournapeau wrote: > On Wed, May 26, 2010 at 4:37 AM, Travis Oliphant > wrote: > > > > Hi everyone, > > > > There has been some talk about re-factoring NumPy to separate out the > > Python C-API layer and make NumPy closer to a C-library. I know > > there are a few different ideas about what this means, and also that > > people are very busy. I also know there is a NumPy 2.0 release that > > is in the works. > > > > I'm excited to let everyone know that we (at Enthought) have been able > > to find resources (about 3 man months) to work on this re-factoring > > project and Scott and Jason (both very experienced C and Python > > programmers) are actively pursuing it. My hope is that NumPy 2.0 > > will contain this re-factoring (which should be finished just after > > SciPy 2010 --- where I'm going to organize a Sprint on NumPy which > > will include at least date-time improvements and re-factoring work). > > This sounds great. As for how this is to be done, what would be the > numpy aspects to be worked on first ? The obvious candidates are > broadcasting, indexing, and ufunc ? > > If it was ufuncs alone it could be broken out into a separate project where ufuncs operated on objects that exposed the buffer interface. This would keep it separate from from numpy and at some point we could drop it into numpy trunk. If things went that way it would also make sense to drop support for versions of python that don't support the new buffer protocol. Speaking of which, can the datetime types be supported with the buffer interface? If not, there is some design work to be done there as to how to use the buffer interface for new types. Chuck -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From stefan at sun.ac.za Tue May 25 21:23:50 2010 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Tue, 25 May 2010 18:23:50 -0700 Subject: [Numpy-discussion] Introduction to Scott, Jason, and (possibly) others from Enthought In-Reply-To: References: <2D706A49-3574-4E5D-81CD-B89B5E6BD0B5@enthought.com> <11E425D5-D34A-44FC-A0B1-6D2641D522EA@enthought.com> Message-ID: On 25 May 2010 15:06, David Cournapeau wrote: > Naming it 3.0 or 2.1 does not matter much - I think we should avoid > breaking things twice. I can see a few solutions: > ?- postpone 2.0 "indefinitely", until this new work is done > ?- backport py3k changes to 1.5 (which would be API and ABI > compatible with 1.4.1), and 2.0 would contain all the breaking > changes. This is a good suggestion. Release 1.5 without ABI breakage and then leave enough time to discuss an optimal API, refactor the C code and include datetime functionality for 2.0. We don't stand anything to gain by rushing. If I'm not mistaken, David did warn that this kind of situation may occur the last time around :) Cheers St?fan From charlesr.harris at gmail.com Tue May 25 22:10:55 2010 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 25 May 2010 20:10:55 -0600 Subject: [Numpy-discussion] Introduction to Scott, Jason, and (possibly) others from Enthought In-Reply-To: References: <2D706A49-3574-4E5D-81CD-B89B5E6BD0B5@enthought.com> <11E425D5-D34A-44FC-A0B1-6D2641D522EA@enthought.com> Message-ID: 2010/5/25 St?fan van der Walt > On 25 May 2010 15:06, David Cournapeau wrote: > > Naming it 3.0 or 2.1 does not matter much - I think we should avoid > > breaking things twice. I can see a few solutions: > > - postpone 2.0 "indefinitely", until this new work is done > > - backport py3k changes to 1.5 (which would be API and ABI > > compatible with 1.4.1), and 2.0 would contain all the breaking > > changes. > > This is a good suggestion. Release 1.5 without ABI breakage and then > leave enough time to discuss an optimal API, refactor the C code and > include datetime functionality for 2.0. We don't stand anything to > gain by rushing. > > If I'm not mistaken, David did warn that this kind of situation may > occur the last time around :) > > IIRC, David was seeing a refactor a year or two off, if ever. I'm concerned that the refactor will go on and on and on, not least because I haven't seen any plan or discussion as to what the precise goals of the refactor are, much less a plan for how to get there. It's hard to have a sprint when no one knows what they are trying to achieve. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From tsyu80 at gmail.com Tue May 25 22:21:25 2010 From: tsyu80 at gmail.com (Tony S Yu) Date: Tue, 25 May 2010 22:21:25 -0400 Subject: [Numpy-discussion] Bug in nanmin called with unsigned integers Message-ID: <60119391-07E5-4E79-9283-B1960E524B5F@gmail.com> I got bit again by this bug with unsigned integers. (My original changes got overwritten when I updated from svn and, unfortunately, merged conflicts without actually looking over the changes.) In any case, I thought it'd be a good time to bump the issue (with patch). Cheers, -Tony PS: Just for context, this issue comes up when displaying images with Chaco (which converts images to unsigned integer arrays and calls nanmin). -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From charlesr.harris at gmail.com Tue May 25 22:34:40 2010 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 25 May 2010 20:34:40 -0600 Subject: [Numpy-discussion] Geometric, negative binomial and poisson fail for extreme arguments Message-ID: Josef, This is ticket #896 from two years ago. IIRC, there was some more recent discussion on the list of some of these. Do you know what the current state of these distributions is? Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Tue May 25 22:57:06 2010 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 25 May 2010 20:57:06 -0600 Subject: [Numpy-discussion] Bug in nanmin called with unsigned integers In-Reply-To: <60119391-07E5-4E79-9283-B1960E524B5F@gmail.com> References: <60119391-07E5-4E79-9283-B1960E524B5F@gmail.com> Message-ID: On Tue, May 25, 2010 at 8:21 PM, Tony S Yu wrote: > I got bit again by this bug with unsigned integers. > (My original changes got overwritten when I updated from svn and, > unfortunately, merged conflicts without actually looking over the changes.) > > In any case, I thought it'd be a good time to bump the issue (with patch > ). > > Cheers, > -Tony > > PS: Just for context, this issue comes up when displaying images with Chaco > (which converts images to unsigned integer arrays and calls nanmin). > > Fixed in r8445. Please add some tests. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Tue May 25 23:20:51 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Tue, 25 May 2010 23:20:51 -0400 Subject: [Numpy-discussion] Geometric, negative binomial and poisson fail for extreme arguments In-Reply-To: References: Message-ID: On Tue, May 25, 2010 at 10:34 PM, Charles R Harris wrote: > Josef, > > This is ticket #896 from two years ago. IIRC, there was some more recent > discussion on the list of some of these. Do you know what the current state > of these distributions is? I don't have any information on these and I don't remember any discussion (and a quick search didn't find anything). I never looked at the integer overflow problem, besides reading the ticket. All 3 distributions are used in scipy.stats and tested for some regular values. (my not very strong opinion: for consistency with the other distributions, I would go with Robert's approach of rejecting overflow samples. I don't know any application where the truncation would have a significant effect. In scipy.stats I switched to returning floats instead of integers for ppf, because we need inf and nans.) BTW: If you are fixing things in np.random, then depreciating and renaming pareto as we discussed recently on the list would help reduce some confusion. I don't think we filed a ticket. Josef > > Chuck > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > From tanwp at gis.a-star.edu.sg Tue May 25 23:25:11 2010 From: tanwp at gis.a-star.edu.sg (Padma TAN) Date: Wed, 26 May 2010 11:25:11 +0800 Subject: [Numpy-discussion] FW: Numpy python build In-Reply-To: Message-ID: Hi, Can I just install numpy and scipy without ATLAS? And what does this means " gnu: no Fortran 90 compiler found"? Im installing on RHEL Thanks in advance! [root at giswk002 numpy-1.3.0]# python setup.py build Running from numpy source directory. 
non-existing path in 'numpy/distutils': 'site.cfg' F2PY Version 2 blas_opt_info: blas_mkl_info: libraries mkl,vml,guide not found in /usr/local/Python-2.6.2/lib libraries mkl,vml,guide not found in /usr/local/lib libraries mkl,vml,guide not found in /usr/lib NOT AVAILABLE atlas_blas_threads_info: Setting PTATLAS=ATLAS libraries ptf77blas,ptcblas,atlas not found in /usr/local/Python-2.6.2/lib libraries ptf77blas,ptcblas,atlas not found in /usr/local/lib libraries ptf77blas,ptcblas,atlas not found in /usr/lib/sse2 libraries ptf77blas,ptcblas,atlas not found in /usr/lib NOT AVAILABLE atlas_blas_info: libraries f77blas,cblas,atlas not found in /usr/local/Python-2.6.2/lib libraries f77blas,cblas,atlas not found in /usr/local/lib libraries f77blas,cblas,atlas not found in /usr/lib/sse2 libraries f77blas,cblas,atlas not found in /usr/lib NOT AVAILABLE /usr/local/numpy-1.3.0/numpy/distutils/system_info.py:1383: UserWarning: Atlas (http://math-atlas.sourceforge.net/) libraries not found. Directories to search for the libraries can be specified in the numpy/distutils/site.cfg file (section [atlas]) or by setting the ATLAS environment variable. warnings.warn(AtlasNotFoundError.__doc__) blas_info: libraries blas not found in /usr/local/Python-2.6.2/lib libraries blas not found in /usr/local/lib FOUND: libraries = ['blas'] library_dirs = ['/usr/lib'] language = f77 FOUND: libraries = ['blas'] library_dirs = ['/usr/lib'] define_macros = [('NO_ATLAS_INFO', 1)] language = f77 lapack_opt_info: lapack_mkl_info: mkl_info: libraries mkl,vml,guide not found in /usr/local/Python-2.6.2/lib libraries mkl,vml,guide not found in /usr/local/lib libraries mkl,vml,guide not found in /usr/lib NOT AVAILABLE NOT AVAILABLE atlas_threads_info: Setting PTATLAS=ATLAS libraries ptf77blas,ptcblas,atlas not found in /usr/local/Python-2.6.2/lib libraries lapack_atlas not found in /usr/local/Python-2.6.2/lib libraries ptf77blas,ptcblas,atlas not found in /usr/local/lib libraries lapack_atlas not found in /usr/local/lib libraries ptf77blas,ptcblas,atlas not found in /usr/lib/sse2 libraries lapack_atlas not found in /usr/lib/sse2 libraries ptf77blas,ptcblas,atlas not found in /usr/lib libraries lapack_atlas not found in /usr/lib numpy.distutils.system_info.atlas_threads_info NOT AVAILABLE atlas_info: libraries f77blas,cblas,atlas not found in /usr/local/Python-2.6.2/lib libraries lapack_atlas not found in /usr/local/Python-2.6.2/lib libraries f77blas,cblas,atlas not found in /usr/local/lib libraries lapack_atlas not found in /usr/local/lib libraries f77blas,cblas,atlas not found in /usr/lib/sse2 libraries lapack_atlas not found in /usr/lib/sse2 libraries f77blas,cblas,atlas not found in /usr/lib libraries lapack_atlas not found in /usr/lib numpy.distutils.system_info.atlas_info NOT AVAILABLE /usr/local/numpy-1.3.0/numpy/distutils/system_info.py:1290: UserWarning: Atlas (http://math-atlas.sourceforge.net/) libraries not found. Directories to search for the libraries can be specified in the numpy/distutils/site.cfg file (section [atlas]) or by setting the ATLAS environment variable. 
warnings.warn(AtlasNotFoundError.__doc__) lapack_info: libraries lapack not found in /usr/local/Python-2.6.2/lib libraries lapack not found in /usr/local/lib FOUND: libraries = ['lapack'] library_dirs = ['/usr/lib'] language = f77 FOUND: libraries = ['lapack', 'blas'] library_dirs = ['/usr/lib'] define_macros = [('NO_ATLAS_INFO', 1)] language = f77 running build running config_cc unifing config_cc, config, build_clib, build_ext, build commands --compiler options running config_fc unifing config_fc, config, build_clib, build_ext, build commands --fcompiler options running build_src building py_modules sources building library "npymath" sources building extension "numpy.core._sort" sources adding 'build/src.linux-i686-2.6/numpy/core/include/numpy/config.h' to sources. adding 'build/src.linux-i686-2.6/numpy/core/include/numpy/numpyconfig.h' to sources. numpy/core/code_generators/genapi.py:9: DeprecationWarning: the md5 module is deprecated; use hashlib instead import md5 executing numpy/core/code_generators/generate_numpy_api.py adding 'build/src.linux-i686-2.6/numpy/core/include/numpy/__multiarray_api.h' to sources. numpy.core - nothing done with h_files = ['build/src.linux-i686-2.6/numpy/core/include/numpy/config.h', 'build/src.linux-i686-2.6/numpy/core/include/numpy/numpyconfig.h', 'build/src.linux-i686-2.6/numpy/core/include/numpy/__multiarray_api.h'] building extension "numpy.core.multiarray" sources adding 'build/src.linux-i686-2.6/numpy/core/include/numpy/config.h' to sources. adding 'build/src.linux-i686-2.6/numpy/core/include/numpy/numpyconfig.h' to sources. executing numpy/core/code_generators/generate_numpy_api.py adding 'build/src.linux-i686-2.6/numpy/core/include/numpy/__multiarray_api.h' to sources. adding 'build/src.linux-i686-2.6/numpy/core/src' to include_dirs. numpy.core - nothing done with h_files = ['build/src.linux-i686-2.6/numpy/core/src/scalartypes.inc', 'build/src.linux-i686-2.6/numpy/core/src/arraytypes.inc', 'build/src.linux-i686-2.6/numpy/core/include/numpy/config.h', 'build/src.linux-i686-2.6/numpy/core/include/numpy/numpyconfig.h', 'build/src.linux-i686-2.6/numpy/core/include/numpy/__multiarray_api.h'] building extension "numpy.core.umath" sources adding 'build/src.linux-i686-2.6/numpy/core/include/numpy/config.h' to sources. adding 'build/src.linux-i686-2.6/numpy/core/include/numpy/numpyconfig.h' to sources. executing numpy/core/code_generators/generate_ufunc_api.py adding 'build/src.linux-i686-2.6/numpy/core/include/numpy/__ufunc_api.h' to sources. adding 'build/src.linux-i686-2.6/numpy/core/src' to include_dirs. numpy.core - nothing done with h_files = ['build/src.linux-i686-2.6/numpy/core/src/scalartypes.inc', 'build/src.linux-i686-2.6/numpy/core/src/arraytypes.inc', 'build/src.linux-i686-2.6/numpy/core/src/umath_funcs.inc', 'build/src.linux-i686-2.6/numpy/core/src/umath_loops.inc', 'build/src.linux-i686-2.6/numpy/core/include/numpy/config.h', 'build/src.linux-i686-2.6/numpy/core/include/numpy/numpyconfig.h', 'build/src.linux-i686-2.6/numpy/core/include/numpy/__ufunc_api.h'] building extension "numpy.core.scalarmath" sources adding 'build/src.linux-i686-2.6/numpy/core/include/numpy/config.h' to sources. adding 'build/src.linux-i686-2.6/numpy/core/include/numpy/numpyconfig.h' to sources. executing numpy/core/code_generators/generate_numpy_api.py adding 'build/src.linux-i686-2.6/numpy/core/include/numpy/__multiarray_api.h' to sources. 
executing numpy/core/code_generators/generate_ufunc_api.py adding 'build/src.linux-i686-2.6/numpy/core/include/numpy/__ufunc_api.h' to sources. numpy.core - nothing done with h_files = ['build/src.linux-i686-2.6/numpy/core/include/numpy/config.h', 'build/src.linux-i686-2.6/numpy/core/include/numpy/numpyconfig.h', 'build/src.linux-i686-2.6/numpy/core/include/numpy/__multiarray_api.h', 'build/src.linux-i686-2.6/numpy/core/include/numpy/__ufunc_api.h'] building extension "numpy.core._dotblas" sources building extension "numpy.core.umath_tests" sources building extension "numpy.lib._compiled_base" sources building extension "numpy.numarray._capi" sources building extension "numpy.fft.fftpack_lite" sources building extension "numpy.linalg.lapack_lite" sources adding 'numpy/linalg/lapack_litemodule.c' to sources. adding 'numpy/linalg/python_xerbla.c' to sources. building extension "numpy.random.mtrand" sources /usr/local/numpy-1.3.0/numpy/distutils/command/config.py:39: DeprecationWarning: +++++++++++++++++++++++++++++++++++++++++++++++++ Usage of try_run is deprecated: please do not use it anymore, and avoid configuration checks involving running executable on the target machine. +++++++++++++++++++++++++++++++++++++++++++++++++ DeprecationWarning) customize GnuFCompiler Found executable /usr/bin/g77 gnu: no Fortran 90 compiler found gnu: no Fortran 90 compiler found customize GnuFCompiler gnu: no Fortran 90 compiler found gnu: no Fortran 90 compiler found customize GnuFCompiler using config C compiler: gcc -pthread -fno-strict-aliasing -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC compile options: '-Inumpy/core/src -Inumpy/core/include -I/usr/local/Python-2.6.2/include/python2.6 -c' gcc: _configtest.c gcc -pthread _configtest.o -o _configtest _configtest failure. removing: _configtest.c _configtest.o _configtest building data_files sources running build_py copying numpy/version.py -> build/lib.linux-i686-2.6/numpy copying build/src.linux-i686-2.6/numpy/__config__.py -> build/lib.linux-i686-2.6/numpy copying build/src.linux-i686-2.6/numpy/distutils/__config__.py -> build/lib.linux-i686-2.6/numpy/distutils running build_clib customize UnixCCompiler customize UnixCCompiler using build_clib running build_ext customize UnixCCompiler customize UnixCCompiler using build_ext customize GnuFCompiler gnu: no Fortran 90 compiler found gnu: no Fortran 90 compiler found customize GnuFCompiler gnu: no Fortran 90 compiler found gnu: no Fortran 90 compiler found customize GnuFCompiler using build_ext running scons running build_scripts adding 'build/scripts.linux-i686-2.6/f2py' to scripts ------ End of Forwarded Message From charlesr.harris at gmail.com Tue May 25 23:35:58 2010 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 25 May 2010 21:35:58 -0600 Subject: [Numpy-discussion] Geometric, negative binomial and poisson fail for extreme arguments In-Reply-To: References: Message-ID: On Tue, May 25, 2010 at 9:20 PM, wrote: > On Tue, May 25, 2010 at 10:34 PM, Charles R Harris > wrote: > > Josef, > > > > This is ticket #896 from two years ago. IIRC, there was some more recent > > discussion on the list of some of these. Do you know what the current > state > > of these distributions is? > > I don't have any information on these and I don't remember any > discussion (and a quick search didn't find anything). I never looked > at the integer overflow problem, besides reading the ticket. > > All 3 distributions are used in scipy.stats and tested for some regular > values. 
> > (my not very strong opinion: for consistency with the other > distributions, I would go with Robert's approach of rejecting overflow > samples. I don't know any application where the truncation would have > a significant effect. > In scipy.stats I switched to returning floats instead of integers for > ppf, because we need inf and nans.) > > BTW: If you are fixing things in np.random, then depreciating and > renaming pareto as we discussed recently on the list would help reduce > some confusion. I don't think we filed a ticket. > > OK, but it would help if you did file a ticket. And if you think truncation is the way to go on the #896 could you post a note there also? Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From oliphant at enthought.com Wed May 26 00:19:37 2010 From: oliphant at enthought.com (Travis Oliphant) Date: Tue, 25 May 2010 23:19:37 -0500 Subject: [Numpy-discussion] Introduction to Scott, Jason, and (possibly) others from Enthought In-Reply-To: References: <2D706A49-3574-4E5D-81CD-B89B5E6BD0B5@enthought.com> <11E425D5-D34A-44FC-A0B1-6D2641D522EA@enthought.com> Message-ID: >> >> I think 2.0 would be a bit early for this. Is there any reason it couldn't be done in 2.1? What is the planned policy with regards to the visible interface for extensions? It would also be nice to have a rough idea of how the resulting code would be layered, i.e., what is the design for this re-factoring. Simply having a design would be a major step forward. > > The problem with doing it in 2.1 is that this re-factoring will require extensions to be re-built. The visible interface to extensions will not change, but there will likely be ABI incompatibility. It seems prudent to do this in NumPy 2.0. Perhaps we can also put in place the ABI-protecting indirection approaches that David C. was suggesting earlier. > > Some aspects of the design are still being fleshed out, but the basic idea is to separate out a core library that is as independent of the Python C-API as possible. There will likely be at least some dependency on the Python C-API (reference counting and error handling and possibly others) which any interface would have to provide in a very simple Python.h -- equivalent, for example. > > Our purpose is to allow NumPy to be integrated with other languages or other frameworks systems without explicitly relying on CPython. There are a lot of questions as to how this will work, and so much of that is being worked out. Part of the reason for this mail is to help ensure that as much of this discussion as possible takes place in public. > > > Sounds good, but what if it doesn't get finished in a few months? I think we should get 2.0.0 out pronto, ideally it would already have been released. I think a major refactoring like this proposal should get the 3.0.0 label. Admittedly that makes keeping a refactored branch current with fixes going into the trunk a hassle, but perhaps that can be worked around somewhat by clearly labeling what files will be touched in the refactoring and possibly rearranging the content of the existing files. This requires a game plan and a clear idea of the goal. Put simply, I think the proposed schedule is too ambitious and needs to be fleshed out. This refactoring isn't going to be as straight forward as the python3k port because a lot of design decisions need to be made along the way. You are correct that there is not much time. However, our timeline is middle of July and we do have dedicated resources. 
I was also hoping to have discussions at SciPy to accelerate the process. -Travis -------------- next part -------------- An HTML attachment was scrubbed... URL: From oliphant at enthought.com Wed May 26 00:22:18 2010 From: oliphant at enthought.com (Travis Oliphant) Date: Tue, 25 May 2010 23:22:18 -0500 Subject: [Numpy-discussion] Introduction to Scott, Jason, and (possibly) others from Enthought In-Reply-To: References: <2D706A49-3574-4E5D-81CD-B89B5E6BD0B5@enthought.com> <11E425D5-D34A-44FC-A0B1-6D2641D522EA@enthought.com> Message-ID: On May 25, 2010, at 4:49 PM, David Goldsmith wrote: > Travis: do you already have a place on the NumPy Development Wiki where you're (b)logging your design decisions? Seems like a good way for concerned parties to monitor your choices in more or less real time and thus provide comment in a timely fashion. This is a great idea of course and we will definitely post progess there. So far, the code has been reviewed and several functions identified for re-factoring. This is taking place in a github branch of numpy called numpy refactor. -Travis > > DG > > On Tue, May 25, 2010 at 2:19 PM, Charles R Harris wrote: > > > On Tue, May 25, 2010 at 2:54 PM, Travis Oliphant wrote: > > On May 25, 2010, at 2:50 PM, Charles R Harris wrote: > >> >> >> On Tue, May 25, 2010 at 1:37 PM, Travis Oliphant wrote: >> >> Hi everyone, >> >> There has been some talk about re-factoring NumPy to separate out the >> Python C-API layer and make NumPy closer to a C-library. I know >> there are a few different ideas about what this means, and also that >> people are very busy. I also know there is a NumPy 2.0 release that >> is in the works. >> >> I'm excited to let everyone know that we (at Enthought) have been able >> to find resources (about 3 man months) to work on this re-factoring >> project and Scott and Jason (both very experienced C and Python >> programmers) are actively pursuing it. My hope is that NumPy 2.0 >> will contain this re-factoring (which should be finished just after >> SciPy 2010 --- where I'm going to organize a Sprint on NumPy which >> will include at least date-time improvements and re-factoring work). >> >> While we have specific goals for the re-factoring, we want this >> activity to be fully integrated with the NumPy community and Scott and >> Jason want to interact with the community as much as feasible as they >> suggest re-factoring changes (though they both have more experience >> with phone-conversations to resolve concerns than email chains and so >> some patience from everybody will be appreciated). >> >> Because Jason and Scott are new to this mailing list (but not new to >> NumPy), I wanted to introduce them so they would feel more >> comfortable posting questions and people would have some context as to >> what they were trying to do. >> >> Scott and Jason are both very proficient and skilled programmers and I >> have full confidence in their abilities. That said, we very much >> want the input of as many people as possible as we pursue the goal of >> grouping together more tightly the Python C-API interface layer to >> NumPy. >> >> I will be involved in some of the discussions, but am currently on a >> different project which has tight schedules and so I will only be able >> to provide limited "mailing-list" visibility. >> >> >> I think 2.0 would be a bit early for this. Is there any reason it couldn't be done in 2.1? What is the planned policy with regards to the visible interface for extensions? 
It would also be nice to have a rough idea of how the resulting code would be layered, i.e., what is the design for this re-factoring. Simply having a design would be a major step forward. > > The problem with doing it in 2.1 is that this re-factoring will require extensions to be re-built. The visible interface to extensions will not change, but there will likely be ABI incompatibility. It seems prudent to do this in NumPy 2.0. Perhaps we can also put in place the ABI-protecting indirection approaches that David C. was suggesting earlier. > > Some aspects of the design are still being fleshed out, but the basic idea is to separate out a core library that is as independent of the Python C-API as possible. There will likely be at least some dependency on the Python C-API (reference counting and error handling and possibly others) which any interface would have to provide in a very simple Python.h -- equivalent, for example. > > Our purpose is to allow NumPy to be integrated with other languages or other frameworks systems without explicitly relying on CPython. There are a lot of questions as to how this will work, and so much of that is being worked out. Part of the reason for this mail is to help ensure that as much of this discussion as possible takes place in public. > > > Sounds good, but what if it doesn't get finished in a few months? I think we should get 2.0.0 out pronto, ideally it would already have been released. I think a major refactoring like this proposal should get the 3.0.0 label. Admittedly that makes keeping a refactored branch current with fixes going into the trunk a hassle, but perhaps that can be worked around somewhat by clearly labeling what files will be touched in the refactoring and possibly rearranging the content of the existing files. This requires a game plan and a clear idea of the goal. Put simply, I think the proposed schedule is too ambitious and needs to be fleshed out. This refactoring isn't going to be as straight forward as the python3k port because a lot of design decisions need to be made along the way. > > Chuck > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > > -- > Mathematician: noun, someone who disavows certainty when their uncertainty set is non-empty, even if that set has measure zero. > > Hope: noun, that delusive spirit which escaped Pandora's jar and, with her lies, prevents mankind from committing a general suicide. (As interpreted by Robert Graves) > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion --- Travis Oliphant Enthought, Inc. oliphant at enthought.com 1-512-536-1057 http://www.enthought.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From oliphant at enthought.com Wed May 26 00:23:43 2010 From: oliphant at enthought.com (Travis Oliphant) Date: Tue, 25 May 2010 23:23:43 -0500 Subject: [Numpy-discussion] Introduction to Scott, Jason, and (possibly) others from Enthought In-Reply-To: References: <2D706A49-3574-4E5D-81CD-B89B5E6BD0B5@enthought.com> <11E425D5-D34A-44FC-A0B1-6D2641D522EA@enthought.com> Message-ID: On May 25, 2010, at 5:06 PM, David Cournapeau wrote: > On Wed, May 26, 2010 at 6:19 AM, Charles R Harris > wrote: > >> Sounds good, but what if it doesn't get finished in a few months? 
I think we >> should get 2.0.0 out pronto, ideally it would already have been released. I >> think a major refactoring like this proposal should get the 3.0.0 label. > > Naming it 3.0 or 2.1 does not matter much - I think we should avoid > breaking things twice. I can see a few solutions: > - postpone 2.0 "indefinitely", until this new work is done > - backport py3k changes to 1.5 (which would be API and ABI > compatible with 1.4.1), and 2.0 would contain all the breaking > changes. This is an interesting idea and also workable. > > I am really worried about breaking things once now and once in a few > months (or even a year). I am too. That's why this discussion. We will have the NumPy refactor done by end of July at the latest. Numpy 2.0 should be able to come out in August. -Travis > > David > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion --- Travis Oliphant Enthought, Inc. oliphant at enthought.com 1-512-536-1057 http://www.enthought.com From stefan at sun.ac.za Wed May 26 00:36:33 2010 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Tue, 25 May 2010 21:36:33 -0700 Subject: [Numpy-discussion] Introduction to Scott, Jason, and (possibly) others from Enthought In-Reply-To: References: <2D706A49-3574-4E5D-81CD-B89B5E6BD0B5@enthought.com> <11E425D5-D34A-44FC-A0B1-6D2641D522EA@enthought.com> Message-ID: On 25 May 2010 21:22, Travis Oliphant wrote: > This is a great idea of course and we will definitely post progess there. > So far, the code has been reviewed and several functions identified for > re-factoring. ? This is taking place in a github branch of numpy called > numpy refactor. Awesome! Since github now supports SVN interaction, and all the core devs use Git, now might be a good time to move the entire numpy source tree? It will certainly make it easier to merge the refactor changes! Regards St?fan From d.l.goldsmith at gmail.com Wed May 26 02:21:20 2010 From: d.l.goldsmith at gmail.com (David Goldsmith) Date: Tue, 25 May 2010 23:21:20 -0700 Subject: [Numpy-discussion] Introduction to Scott, Jason, and (possibly) others from Enthought In-Reply-To: References: <2D706A49-3574-4E5D-81CD-B89B5E6BD0B5@enthought.com> <11E425D5-D34A-44FC-A0B1-6D2641D522EA@enthought.com> Message-ID: On Tue, May 25, 2010 at 9:22 PM, Travis Oliphant wrote: > > On May 25, 2010, at 4:49 PM, David Goldsmith wrote: > > Travis: do you already have a place on the NumPy Development Wikiwhere you're (b)logging your design decisions? Seems like a good way for > concerned parties to monitor your choices in more or less real time and thus > provide comment in a timely fashion. > > > This is a great idea of course and we will definitely post progess there. > > Thanks; specific URL please, when available; plus, prominently feature (a link to) the location on the Development Wiki home page, at the very least (i.e., if not also on the NumPy home page). > So far, the code has been reviewed, > I.e., the existing code, yes? > and several functions identified for re-factoring. > Please enumerate on the "Wiki Refactoring Log" (name tentative - I don't care what we call it, just so long as it exists, its purpose is clear, and we all know where it is). This is taking place in a github branch of numpy called numpy refactor. > "This" = the actual creation/modification of code, yes? 
DG > > -Travis > > > DG > > On Tue, May 25, 2010 at 2:19 PM, Charles R Harris < > charlesr.harris at gmail.com> wrote: > >> >> >> On Tue, May 25, 2010 at 2:54 PM, Travis Oliphant wrote: >> >>> >>> On May 25, 2010, at 2:50 PM, Charles R Harris wrote: >>> >>> >>> >>> On Tue, May 25, 2010 at 1:37 PM, Travis Oliphant >> > wrote: >>> >>>> >>>> Hi everyone, >>>> >>>> There has been some talk about re-factoring NumPy to separate out the >>>> Python C-API layer and make NumPy closer to a C-library. I know >>>> there are a few different ideas about what this means, and also that >>>> people are very busy. I also know there is a NumPy 2.0 release that >>>> is in the works. >>>> >>>> I'm excited to let everyone know that we (at Enthought) have been able >>>> to find resources (about 3 man months) to work on this re-factoring >>>> project and Scott and Jason (both very experienced C and Python >>>> programmers) are actively pursuing it. My hope is that NumPy 2.0 >>>> will contain this re-factoring (which should be finished just after >>>> SciPy 2010 --- where I'm going to organize a Sprint on NumPy which >>>> will include at least date-time improvements and re-factoring work). >>>> >>>> While we have specific goals for the re-factoring, we want this >>>> activity to be fully integrated with the NumPy community and Scott and >>>> Jason want to interact with the community as much as feasible as they >>>> suggest re-factoring changes (though they both have more experience >>>> with phone-conversations to resolve concerns than email chains and so >>>> some patience from everybody will be appreciated). >>>> >>>> Because Jason and Scott are new to this mailing list (but not new to >>>> NumPy), I wanted to introduce them so they would feel more >>>> comfortable posting questions and people would have some context as to >>>> what they were trying to do. >>>> >>>> Scott and Jason are both very proficient and skilled programmers and I >>>> have full confidence in their abilities. That said, we very much >>>> want the input of as many people as possible as we pursue the goal of >>>> grouping together more tightly the Python C-API interface layer to >>>> NumPy. >>>> >>>> I will be involved in some of the discussions, but am currently on a >>>> different project which has tight schedules and so I will only be able >>>> to provide limited "mailing-list" visibility. >>>> >>>> >>> I think 2.0 would be a bit early for this. Is there any reason it >>> couldn't be done in 2.1? What is the planned policy with regards to the >>> visible interface for extensions? It would also be nice to have a rough idea >>> of how the resulting code would be layered, i.e., what is the design for >>> this re-factoring. Simply having a design would be a major step forward. >>> >>> >>> The problem with doing it in 2.1 is that this re-factoring will require >>> extensions to be re-built. The visible interface to extensions will not >>> change, but there will likely be ABI incompatibility. It seems prudent to >>> do this in NumPy 2.0. Perhaps we can also put in place the ABI-protecting >>> indirection approaches that David C. was suggesting earlier. >>> >>> Some aspects of the design are still being fleshed out, but the basic >>> idea is to separate out a core library that is as independent of the Python >>> C-API as possible. 
There will likely be at least some dependency on the >>> Python C-API (reference counting and error handling and possibly others) >>> which any interface would have to provide in a very simple Python.h -- >>> equivalent, for example. >>> >>> Our purpose is to allow NumPy to be integrated with other languages or >>> other frameworks systems without explicitly relying on CPython. There are >>> a lot of questions as to how this will work, and so much of that is being >>> worked out. Part of the reason for this mail is to help ensure that as >>> much of this discussion as possible takes place in public. >>> >>> >> Sounds good, but what if it doesn't get finished in a few months? I think >> we should get 2.0.0 out pronto, ideally it would already have been released. >> I think a major refactoring like this proposal should get the 3.0.0 label. >> Admittedly that makes keeping a refactored branch current with fixes going >> into the trunk a hassle, but perhaps that can be worked around somewhat by >> clearly labeling what files will be touched in the refactoring and possibly >> rearranging the content of the existing files. This requires a game plan and >> a clear idea of the goal. Put simply, I think the proposed schedule is too >> ambitious and needs to be fleshed out. This refactoring isn't going to be >> as straight forward as the python3k port because a lot of design decisions >> need to be made along the way. >> >> Chuck >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> > > > -- > Mathematician: noun, someone who disavows certainty when their uncertainty > set is non-empty, even if that set has measure zero. > > Hope: noun, that delusive spirit which escaped Pandora's jar and, with her > lies, prevents mankind from committing a general suicide. (As interpreted > by Robert Graves) > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > --- > Travis Oliphant > Enthought, Inc. > oliphant at enthought.com > 1-512-536-1057 > http://www.enthought.com > > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -- Mathematician: noun, someone who disavows certainty when their uncertainty set is non-empty, even if that set has measure zero. Hope: noun, that delusive spirit which escaped Pandora's jar and, with her lies, prevents mankind from committing a general suicide. (As interpreted by Robert Graves) -------------- next part -------------- An HTML attachment was scrubbed... URL: From sebastian.walter at gmail.com Wed May 26 04:50:19 2010 From: sebastian.walter at gmail.com (Sebastian Walter) Date: Wed, 26 May 2010 10:50:19 +0200 Subject: [Numpy-discussion] Introduction to Scott, Jason, and (possibly) others from Enthought In-Reply-To: References: <2D706A49-3574-4E5D-81CD-B89B5E6BD0B5@enthought.com> <11E425D5-D34A-44FC-A0B1-6D2641D522EA@enthought.com> Message-ID: I'm a potential user of the C-API and therefore I'm very interested in the outcome. In the previous discussion (http://comments.gmane.org/gmane.comp.python.numeric.general/37409) many different views on what the new C-API "should" be were expressed. Naturally, I wonder if the new C-API will be useful for my purposes. 
So, I'm not so excited about a "refactoring log" where only the progress is reported. I fear that some (potentially minor) design decisions would render the new C-API useless for me. So my question is: Does this "refactoring log" also include something like a Numpy Enhancement Proposal? Something that can be discussed beforehand? I.e., will there be a detailed description (i.e. code examples) what the goal of the refactoring is? If there is any interest, I could provide some simple test examples in C that would explain what I'd like to be able to do with the new C-API. Sebastian On Wed, May 26, 2010 at 8:21 AM, David Goldsmith wrote: > On Tue, May 25, 2010 at 9:22 PM, Travis Oliphant > wrote: >> >> On May 25, 2010, at 4:49 PM, David Goldsmith wrote: >> >> Travis: do you already have a place on the NumPy Development Wiki where >> you're (b)logging your design decisions?? Seems like a good way for >> concerned parties to monitor your choices in more or less real time and thus >> provide comment in a timely fashion. >> >> This is a great idea of course and we will definitely post progess there. >> > > Thanks; specific URL please, when available; plus, prominently feature (a > link to) the location on the Development Wiki home page, at the very least > (i.e., if not also on the NumPy home page). > >> >> So far, the code has been reviewed, > > I.e., the existing code, yes? > >> >> and several functions identified for re-factoring. > > Please enumerate on the "Wiki Refactoring Log" (name tentative - I don't > care what we call it, just so long as it exists, its purpose is clear, and > we all know where it is). > >> This is taking place in a github branch of numpy called numpy refactor. > > "This" = the actual creation/modification of code, yes? > > DG >> >> -Travis >> >> DG >> >> On Tue, May 25, 2010 at 2:19 PM, Charles R Harris >> wrote: >>> >>> >>> On Tue, May 25, 2010 at 2:54 PM, Travis Oliphant >>> wrote: >>>> >>>> On May 25, 2010, at 2:50 PM, Charles R Harris wrote: >>>> >>>> >>>> On Tue, May 25, 2010 at 1:37 PM, Travis Oliphant >>>> wrote: >>>>> >>>>> Hi everyone, >>>>> >>>>> There has been some talk about re-factoring NumPy to separate out the >>>>> Python C-API layer and make NumPy closer to a C-library. ? I know >>>>> there are a few different ideas about what this means, and also that >>>>> people are very busy. ?I also know there is a NumPy 2.0 release that >>>>> is in the works. >>>>> >>>>> I'm excited to let everyone know that we (at Enthought) have been able >>>>> to find resources (about 3 man months) to work on this re-factoring >>>>> project and Scott and Jason (both very experienced C and Python >>>>> programmers) are actively pursuing it. ? ?My hope is that NumPy 2.0 >>>>> will contain this re-factoring (which should be finished just after >>>>> SciPy 2010 --- where I'm going to organize a Sprint on NumPy which >>>>> will include at least date-time improvements and re-factoring work). >>>>> >>>>> While we have specific goals for the re-factoring, we want this >>>>> activity to be fully integrated with the NumPy community and Scott and >>>>> Jason want to interact with the community as much as feasible as they >>>>> suggest re-factoring changes (though they both have more experience >>>>> with phone-conversations to resolve concerns than email chains and so >>>>> some patience from everybody will be appreciated). 
>>>>> >>>>> Because Jason and Scott are new to this mailing list (but not new to >>>>> NumPy), ?I wanted to introduce them so they would feel more >>>>> comfortable posting questions and people would have some context as to >>>>> what they were trying to do. >>>>> >>>>> Scott and Jason are both very proficient and skilled programmers and I >>>>> have full confidence in their abilities. ? That said, we very much >>>>> want the input of as many people as possible as we pursue the goal of >>>>> grouping together more tightly the Python C-API interface layer to >>>>> NumPy. >>>>> >>>>> I will be involved in some of the discussions, but am currently on a >>>>> different project which has tight schedules and so I will only be able >>>>> to provide limited "mailing-list" visibility. >>>>> >>>> >>>> I think 2.0 would be a bit early for this. Is there any reason it >>>> couldn't be done in 2.1? What is the planned policy with regards to the >>>> visible interface for extensions? It would also be nice to have a rough idea >>>> of how the resulting code would be layered, i.e., what is the design for >>>> this re-factoring. Simply having a design would be a major step forward. >>>> >>>> The problem with doing it in 2.1 is that this re-factoring will require >>>> extensions to be re-built. ? The visible interface to extensions will not >>>> change, but there will likely be ABI incompatibility. ? ?It seems prudent to >>>> do this in NumPy 2.0. ? Perhaps we can also put in place the ABI-protecting >>>> indirection approaches that David C. was suggesting earlier. >>>> Some aspects of the design are still being fleshed out, but the basic >>>> idea is to separate out a core library that is as independent of the Python >>>> C-API as possible. ? ?There will likely be at least some dependency on the >>>> Python C-API (reference counting and error handling and possibly others) >>>> which any interface would have to provide in a very simple Python.h -- >>>> equivalent, for example. >>>> Our purpose is to allow NumPy to be integrated with other languages or >>>> other frameworks systems without explicitly relying on CPython. ? ?There are >>>> a lot of questions as to how this will work, and so much of that is being >>>> worked out. ? Part of the reason for this mail is to help ensure that as >>>> much of this discussion as possible takes place in public. >>> >>> Sounds good, but what if it doesn't get finished in a few months? I think >>> we should get 2.0.0 out pronto, ideally it would already have been released. >>> I think a major refactoring like this proposal should get the 3.0.0 label. >>> Admittedly that makes keeping a refactored branch current with fixes going >>> into the trunk a hassle, but perhaps that can be worked around somewhat by >>> clearly labeling what files will be touched in the refactoring and possibly >>> rearranging the content of the existing files. This requires a game plan and >>> a clear idea of the goal. Put simply, I think the proposed schedule is too >>> ambitious and needs to be fleshed out.? This refactoring isn't going to be >>> as straight forward as the python3k port because a lot of design decisions >>> need to be made along the way. >>> >>> Chuck >>> _______________________________________________ >>> NumPy-Discussion mailing list >>> NumPy-Discussion at scipy.org >>> http://mail.scipy.org/mailman/listinfo/numpy-discussion >>> >> >> >> >> -- >> Mathematician: noun, someone who disavows certainty when their uncertainty >> set is non-empty, even if that set has measure zero. 
>> >> Hope: noun, that delusive spirit which escaped Pandora's jar and, with her >> lies, prevents mankind from committing a general suicide. ?(As interpreted >> by Robert Graves) >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> --- >> Travis Oliphant >> Enthought, Inc. >> oliphant at enthought.com >> 1-512-536-1057 >> http://www.enthought.com >> >> >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> > > > > -- > Mathematician: noun, someone who disavows certainty when their uncertainty > set is non-empty, even if that set has measure zero. > > Hope: noun, that delusive spirit which escaped Pandora's jar and, with her > lies, prevents mankind from committing a general suicide. ?(As interpreted > by Robert Graves) > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > From pav at iki.fi Wed May 26 04:59:08 2010 From: pav at iki.fi (Pauli Virtanen) Date: Wed, 26 May 2010 08:59:08 +0000 (UTC) Subject: [Numpy-discussion] Extending documentation to c code References: Message-ID: Wed, 26 May 2010 06:57:27 +0900, David Cournapeau wrote: [clip: doxygen] > It is yet another format to use inside C sources (I don't think doxygen > supports rest), and I would rather have something that is similar, > ideally integrated into sphinx. It also generates rather ugly doc by > default, Anyway, we can probably nevertheless just agree on a readable plain-text/ rst format, and then just use doxygen to generate the docs, as a band-aid. http://github.com/pv/numpycdoc -- Pauli Virtanen From pav at iki.fi Wed May 26 06:31:21 2010 From: pav at iki.fi (Pauli Virtanen) Date: Wed, 26 May 2010 10:31:21 +0000 (UTC) Subject: [Numpy-discussion] Introduction to Scott, Jason, and (possibly) others from Enthought References: <2D706A49-3574-4E5D-81CD-B89B5E6BD0B5@enthought.com> <11E425D5-D34A-44FC-A0B1-6D2641D522EA@enthought.com> Message-ID: Wed, 26 May 2010 10:50:19 +0200, Sebastian Walter wrote: > I'm a potential user of the C-API and therefore I'm very interested in > the outcome. > In the previous discussion > (http://comments.gmane.org/gmane.comp.python.numeric.general/37409) many > different views on what the new C-API "should" be were expressed. I believe the aim of the refactor is to *not* change the C-API at all, but separate it internally from the routines that do the heavy lifting. Externally, Numpy would still look the same, but be more easy to maintain. The routines that do the heavy lifting could then be good for reuse and be more easy to maintain, but I think how and where they would be exposed hasn't been discussed so far... -- Pauli Virtanen From sebastian.walter at gmail.com Wed May 26 07:40:21 2010 From: sebastian.walter at gmail.com (Sebastian Walter) Date: Wed, 26 May 2010 13:40:21 +0200 Subject: [Numpy-discussion] Introduction to Scott, Jason, and (possibly) others from Enthought In-Reply-To: References: <2D706A49-3574-4E5D-81CD-B89B5E6BD0B5@enthought.com> <11E425D5-D34A-44FC-A0B1-6D2641D522EA@enthought.com> Message-ID: On Wed, May 26, 2010 at 12:31 PM, Pauli Virtanen wrote: > Wed, 26 May 2010 10:50:19 +0200, Sebastian Walter wrote: >> I'm a potential user of the C-API and therefore I'm very interested in >> the outcome. 
>> In the previous discussion >> (http://comments.gmane.org/gmane.comp.python.numeric.general/37409) many >> different views on what the new C-API "should" be were expressed. > > I believe the aim of the refactor is to *not* change the C-API at all, > but separate it internally from the routines that do the heavy lifting. > Externally, Numpy would still look the same, but be more easy to maintain. Sorry for the confusion. By C-API I meant a C-API that would be independent of the CPython API. > > The routines that do the heavy lifting could then be good for reuse and > be more easy to maintain, but I think how and where they would be exposed > hasn't been discussed so far... I had the impression that the goal is not only to have code that is easier to maintain but to give developers the possibility to use numpy functionality (broadcasting, ufuncs, ...) within C code without having to use CPython API (refcounts, construction of PyObjects etc.). > > -- > Pauli Virtanen > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From usenet at bersch.net Wed May 26 08:07:06 2010 From: usenet at bersch.net (Christoph Bersch) Date: Wed, 26 May 2010 14:07:06 +0200 Subject: [Numpy-discussion] Reading and writing binary data to/from file objects Message-ID: <4BFD0EEA.3010409@bersch.net> Hi, I want to read binary data from a file using fromfile. This works as long as I use a file name as argument to fromfile. With a file object the data is wrong! Consider the following example: from numpy import * fname='file.bin' fname2='file2.bin' a = arange(1, 30) print type(a[0]) print "\noriginal data" print a # write to file name a.tofile(fname) # write to file object f = open(fname2, 'w') a.tofile(f) f.close() print "\nWritten to file name, read from file name" b = fromfile(fname, dtype=int32) print b print "\nWritten to file name, read from file object" f = open(fname, 'r') b = fromfile(f, dtype=int32) f.close() print b print "\nWritten to file object, read from file name" b = fromfile(fname2, dtype=int32) print b print "\nWritten to file object, read from file object" f = open(fname2, 'r') b = fromfile(f, dtype=int32) f.close() print b This prints: original data [ 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29] Written to file name, read from file name [ 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29] Written to file name, read from file object [ 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 0 0 0 0] Written to file object, read from file name [ 1 2 3 4 5 6 7 8 9 2573 2816 3072 3328 3584 3840 4096 4352 4608 4864 5120 5376 5632 5888 6144 6400 6656 6912 7168 7424] Written to file object, read from file object [ 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 0 0 0 0] Size of 'file.bin' (written to file name) is 116 Bytes, size of 'file2.bin' (written to file object) is 117 Bytes! So the only correct data is when writing to a file name and reading from a file name. All other variants yield wrong data! This happens with Numpy 1.4.1 with Python 2.6 under Windows XP. For comparison I only have a 1.1.0 Numpy with Python 2.5 under Linux Debian where in all cases I get the same, correct arrays! Am I doing something substantially wrong or is this a bug? 
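(As the reply below points out, the culprit is that the file objects are opened in text mode, so on Windows any 0x0A byte in the raw data gets newline-translated; that also explains the one-byte size difference. A corrected sketch of the same round-trip through a file object, using binary mode, would be:)

import numpy as np

a = np.arange(1, 30)

with open('file2.bin', 'wb') as f:      # 'wb', not 'w'
    a.tofile(f)

with open('file2.bin', 'rb') as f:      # 'rb', not 'r'
    b = np.fromfile(f, dtype=a.dtype)

print((a == b).all())                   # True -- no bytes mangled by newline translation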
Christoph From pav at iki.fi Wed May 26 08:14:03 2010 From: pav at iki.fi (Pauli Virtanen) Date: Wed, 26 May 2010 14:14:03 +0200 Subject: [Numpy-discussion] Reading and writing binary data to/from file objects In-Reply-To: <4BFD0EEA.3010409@bersch.net> References: <4BFD0EEA.3010409@bersch.net> Message-ID: <1274876043.2270.139.camel@talisman> ke, 2010-05-26 kello 14:07 +0200, Christoph Bersch kirjoitti: > f = open(fname2, 'w') [clip] > Am I doing something substantially wrong or is this a bug? You are opening files in text mode. Use mode 'wb' instead. -- Pauli Virtanen From usenet at bersch.net Wed May 26 08:24:47 2010 From: usenet at bersch.net (Christoph Bersch) Date: Wed, 26 May 2010 14:24:47 +0200 Subject: [Numpy-discussion] Reading and writing binary data to/from file objects In-Reply-To: <1274876043.2270.139.camel@talisman> References: <4BFD0EEA.3010409@bersch.net> <1274876043.2270.139.camel@talisman> Message-ID: <4BFD130F.4000706@bersch.net> Pauli Virtanen schrieb: > ke, 2010-05-26 kello 14:07 +0200, Christoph Bersch kirjoitti: >> f = open(fname2, 'w') > [clip] >> Am I doing something substantially wrong or is this a bug? > > You are opening files in text mode. Use mode 'wb' instead. That was it, thank you! Linux does not seem to care about binary or text mode. As I develop under Linux I was a bit puzzled by the different behaviour on Windows. Christoph From charlesr.harris at gmail.com Wed May 26 09:11:28 2010 From: charlesr.harris at gmail.com (Charles R Harris) Date: Wed, 26 May 2010 07:11:28 -0600 Subject: [Numpy-discussion] __eq__ with str and object In-Reply-To: <4BFC3140.7090708@stsci.edu> References: <4BFC3140.7090708@stsci.edu> Message-ID: On Tue, May 25, 2010 at 2:21 PM, Michael Droettboom wrote: > Seems like a bug to me. Certain branches in _array_richcompare return > False to fail rather than Py_NotImplemented, which means the > string-understanding comparison fallbacks don't run. Attached is a (simple) > patch that resolves this bug, and doesn't seem to cause any of the unit > tests to fail. Does this make sense to someone with a better understanding > of the rich comparison code than I? > > Mike > > > On 05/25/2010 12:54 PM, Keith Goodman wrote: > >> a1 = np.array(['a', 'b'], dtype=object) >>>> >> a2 = np.array(['a', 'b']) >>>> >> >>>> >> a1 == a2 >>>> >>>> >>> array([ True, True], dtype=bool) # Looks good >> >> >>> >> a2 == a1 >>>> >>>> >>> False # Should I have expected this? >> >> >> > > > Could you open a ticket for this and mark it for review? Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Wed May 26 09:15:08 2010 From: charlesr.harris at gmail.com (Charles R Harris) Date: Wed, 26 May 2010 07:15:08 -0600 Subject: [Numpy-discussion] Extending documentation to c code In-Reply-To: References: Message-ID: On Wed, May 26, 2010 at 2:59 AM, Pauli Virtanen wrote: > Wed, 26 May 2010 06:57:27 +0900, David Cournapeau wrote: > [clip: doxygen] > > It is yet another format to use inside C sources (I don't think doxygen > > supports rest), and I would rather have something that is similar, > > ideally integrated into sphinx. It also generates rather ugly doc by > > default, > > Anyway, we can probably nevertheless just agree on a readable plain-text/ > rst format, and then just use doxygen to generate the docs, as a band-aid. > > http://github.com/pv/numpycdoc > > Neat. I didn't quite see the how how you connected the rst documentation and doxygen. 
Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From kwgoodman at gmail.com Wed May 26 09:42:56 2010 From: kwgoodman at gmail.com (Keith Goodman) Date: Wed, 26 May 2010 06:42:56 -0700 Subject: [Numpy-discussion] __eq__ with str and object In-Reply-To: References: <4BFC3140.7090708@stsci.edu> Message-ID: On Wed, May 26, 2010 at 6:11 AM, Charles R Harris wrote: > > > On Tue, May 25, 2010 at 2:21 PM, Michael Droettboom wrote: >> >> Seems like a bug to me. ?Certain branches in _array_richcompare return >> False to fail rather than Py_NotImplemented, which means the >> string-understanding comparison fallbacks don't run. ?Attached is a (simple) >> patch that resolves this bug, and doesn't seem to cause any of the unit >> tests to fail. ?Does this make sense to someone with a better understanding >> of the rich comparison code than I? >> >> Mike >> >> On 05/25/2010 12:54 PM, Keith Goodman wrote: >>>>> >>>>> a1 = np.array(['a', 'b'], dtype=object) >>>>> >> ?a2 = np.array(['a', 'b']) >>>>> >> >>>>> >> ?a1 == a2 >>>>> >>> >>> ? ?array([ True, ?True], dtype=bool) ?# Looks good >>> >>>>> >>>>> >> ?a2 == a1 >>>>> >>> >>> ? ?False ?# Should I have expected this? >>> >>> >> >> > > Could you open a ticket for this and mark it for review? Here's the ticket: http://projects.scipy.org/numpy/ticket/1491 Mike, could you attach your fix? From arthurdeconihout at gmail.com Wed May 26 09:43:41 2010 From: arthurdeconihout at gmail.com (arthur de conihout) Date: Wed, 26 May 2010 15:43:41 +0200 Subject: [Numpy-discussion] Help Convolution with binaural filters(HRTFs) In-Reply-To: References: Message-ID: > > > > Hi, >> i try to implement a real-time convolution module refreshed by head >> listener location (angle from a reference point).The result of the >> convolution by binaural flters(HRTFs) allows me to spatialize a monophonic >> wavfile. I got trouble with this as long as my convolution doesnt seem to >> work properly: >> np.convolve() doesnt convolve the entire entry signal >> >> ->trouble with extracting numpyarrays from the audio wav. filters and >> monophonic entry >> ->trouble then with encaspulating the resulting array in a proper wav >> file...it is not read by audacity >> >> Do you have any idea of how this could work or do you have any >> implementation of stereo filtering by impulse response to submit me >> Thank you very much >> >> Arthur de Conihout >> >> > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at googlemail.com Wed May 26 09:43:55 2010 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Wed, 26 May 2010 21:43:55 +0800 Subject: [Numpy-discussion] FW: Numpy python build In-Reply-To: References: Message-ID: On Wed, May 26, 2010 at 11:25 AM, Padma TAN wrote: > Hi, > > Can I just install numpy and scipy without ATLAS? And what does this means > " > gnu: no Fortran 90 compiler found"? > Yes you can install without ATLAS. And BLAS and LAPACK were found so you should be fine. Did you have an actual problem or are you asking out of curiosity? Numpy does not contain any Fortran code, Scipy does. But that warning doesn't mean you have a problem - a number of compilers are checked and the first one that's found is used. Please also have a look at the RHEL section on http://www.scipy.org/Installing_SciPy/Linux. Cheers, Ralf > Im installing on RHEL > > Thanks in advance! > > [root at giswk002 numpy-1.3.0]# python setup.py build > Running from numpy source directory. 
URL: From tsyu80 at gmail.com Wed May 26 09:59:17 2010 From: tsyu80 at gmail.com (Tony S Yu) Date: Wed, 26 May 2010 09:59:17 -0400 Subject: [Numpy-discussion] Bug in nanmin called with unsigned integers In-Reply-To: References: <60119391-07E5-4E79-9283-B1960E524B5F@gmail.com> Message-ID: <02130541-D982-43F9-BD33-B92EC34E63E1@gmail.com> On May 25, 2010, at 10:57 PM, Charles R Harris wrote: > > > On Tue, May 25, 2010 at 8:21 PM, Tony S Yu wrote: > I got bit again by this bug with unsigned integers. (My original changes got overwritten when I updated from svn and, unfortunately, merged conflicts without actually looking over the changes.) > > In any case, I thought it'd be a good time to bump the issue (with patch). > > Cheers, > -Tony > > PS: Just for context, this issue comes up when displaying images with Chaco (which converts images to unsigned integer arrays and calls nanmin). > > > Fixed in r8445. Please add some tests. I'm not totally sure what's appropriate to test, so I just added a simple test to the comments for the ticket. On a side note, I noticed that all the nan-ops degenerate to their non-nan-ops counterparts (i.e. nanmin --> min) when called with integer dtypes. Below is a diff where that's made a little more obvious by returning early for integer dtypes. Cheers, -Tony Index: numpy/lib/function_base.py =================================================================== --- numpy/lib/function_base.py (revision 8445) +++ numpy/lib/function_base.py (working copy) @@ -1295,15 +1295,15 @@ """ y = array(a, subok=True) - mask = isnan(a) # We only need to take care of NaN's in floating point arrays - if not np.issubdtype(y.dtype, np.integer): - # y[mask] = fill - # We can't use fancy indexing here as it'll mess w/ MaskedArrays - # Instead, let's fill the array directly... - np.putmask(y, mask, fill) - + if np.issubdtype(y.dtype, np.integer): + return op(y, axis=axis) + mask = isnan(a) + # y[mask] = fill + # We can't use fancy indexing here as it'll mess w/ MaskedArrays + # Instead, let's fill the array directly... + np.putmask(y, mask, fill) res = op(y, axis=axis) mask_all_along_axis = mask.all(axis=axis) -------------- next part -------------- An HTML attachment was scrubbed... URL: From pav at iki.fi Wed May 26 10:14:23 2010 From: pav at iki.fi (Pauli Virtanen) Date: Wed, 26 May 2010 14:14:23 +0000 (UTC) Subject: [Numpy-discussion] Extending documentation to c code References: Message-ID: Wed, 26 May 2010 07:15:08 -0600, Charles R Harris wrote: > On Wed, May 26, 2010 at 2:59 AM, Pauli Virtanen wrote: > >> Wed, 26 May 2010 06:57:27 +0900, David Cournapeau wrote: [clip: >> doxygen] >> > It is yet another format to use inside C sources (I don't think >> > doxygen supports rest), and I would rather have something that is >> > similar, ideally integrated into sphinx. It also generates rather >> > ugly doc by default, >> >> Anyway, we can probably nevertheless just agree on a readable >> plain-text/ rst format, and then just use doxygen to generate the docs, >> as a band-aid. >> >> http://github.com/pv/numpycdoc > > Neat. I didn't quite see the how how you connected the rst documentation > and doxygen. I didn't :) But I just did: doing this it was actually a 10 min job since Doxygen accepts HTML -- now it parses the comments as RST and renders it properly as HTML in the Doxygen output. Of course getting links etc. to work would require more effort, but that's left as an exercise for someone else to finish. 
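To make the idea concrete, here is a rough sketch of the sort of glue I mean. This is *not* the numpycdoc code itself, just an illustration of the approach: the file name, the comment-matching regex, and the assumption that the comments are plain /* ... */ blocks written in reST are all made up for the example, and it only needs docutils installed.

import re
from docutils.core import publish_parts

# Any C source file; the name here is only illustrative.
c_source = open('multiarraymodule.c').read()

# Very rough heuristic: a /* ... */ block immediately followed by something
# that looks like a function definition (ends in '{' with no ';' in between).
pattern = re.compile(r'/\*(.*?)\*/\s*\n[A-Za-z_][^;{]*\{', re.S)

for match in pattern.finditer(c_source):
    comment = match.group(1).strip()
    # Treat the comment body as reST and render it to an HTML fragment,
    # which Doxygen will then pass through into its output.
    html = publish_parts(comment, writer_name='html')['html_body']
    print(html)

Something along these lines is really all the "integration" amounts to at this stage; links and cross-references would need more than a regex.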
Pauli From chanley at stsci.edu Wed May 26 12:19:51 2010 From: chanley at stsci.edu (Christopher Hanley) Date: Wed, 26 May 2010 12:19:51 -0400 Subject: [Numpy-discussion] numpy and the Google App Engine Message-ID: Greetings, Google provides a product called App Engine. The description from their site follows, "Google App Engine enables you to build and host web apps on the same systems that power Google applications. App Engine offers fast development and deployment; simple administration, with no need to worry about hardware, patches or backups; and effortless scalability. " You can deploy applications written in either Python or JAVA. There are free and paid versions of the service. The Google App Engine would appear to be a powerful source of CPU cycles for scientific computing. Unfortunately this is currently not the case because numpy is not one of the supported libraries. The Python App Engine allows only the installation of user supplied pure Python code. I have recently returned from attending the Google I/O conference in San Francisco. While there I inquired into the possibility of getting numpy added. The basic response was that there doesn't appear to be much interest from the community given the amount of work it would take to vet and add numpy. I would like to ask your help in changing this perception. The quickest and easiest thing you can do would be to add your "me too" to this feature request (item #190) on the support site: http://code.google.com/p/googleappengine/issues/detail?id=190 If this issue is important to you could also consider raising this issue in the related Google Group: http://groups.google.com/group/google-appengine Letting Google know how you will use numpy would be helpful. If you or your institution would be willing to pay for service if you could deploy cloud applications that required numpy would be helpful to let them know as well. Finally, if you run into any App Engine developers (Guido included) let them know that you would like to see numpy added. Thank you for your time and consideration. Chris -- Christopher Hanley Senior Systems Software Engineer Space Telescope Science Institute 3700 San Martin Drive Baltimore MD, 21218 (410) 338-4338 From bpederse at gmail.com Wed May 26 12:22:06 2010 From: bpederse at gmail.com (Brent Pedersen) Date: Wed, 26 May 2010 09:22:06 -0700 Subject: [Numpy-discussion] [Patch] Fix memmap pickling In-Reply-To: <20100524223750.GC32540@phare.normalesup.org> References: <20100524222537.GB32540@phare.normalesup.org> <20100524223750.GC32540@phare.normalesup.org> Message-ID: On Mon, May 24, 2010 at 3:37 PM, Gael Varoquaux wrote: > On Mon, May 24, 2010 at 03:33:09PM -0700, Brent Pedersen wrote: >> On Mon, May 24, 2010 at 3:25 PM, Gael Varoquaux >> wrote: >> > Memmapped arrays don't pickle right. I know that to get them to >> > really pickle and restore identically, we would need some effort. >> > However, in the current status, pickling and restoring a memmapped array >> > leads to tracebacks that seem like they could be avoided. > >> > I am attaching a patch with a test that shows the problem, and a fix. >> > Should I create a ticket, or is this light-enough to be applied >> > immediatly? > >> also check this: >> http://projects.scipy.org/numpy/ticket/1452 > >> still needs work. > > Does look good. Is there an ETA for your patch to be applied? 
> > Right now this bug is making code crash when memmapped arrays are used > (eg multiprocessing), so a hot fix can be useful, without removing any > merit to your work that addresses the underlying problem. > > Cheers, > > Ga?l > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > gael, not sure about ETA of application. i think the main remaining problem (other than more tests) is py3 support--as charris points out in the ticket. i have a start which shadows numpy's __getitem__, but havent fixed all the bugs--and not sure that's a good idea. my original patch was quite simple as well, but once it starts supporting all versions and more edge cases ... From robert.kern at gmail.com Wed May 26 12:54:17 2010 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 26 May 2010 12:54:17 -0400 Subject: [Numpy-discussion] [SciPy-Dev] numpy and the Google App Engine In-Reply-To: References: Message-ID: On Wed, May 26, 2010 at 12:19, Christopher Hanley wrote: > Greetings, > > Google provides a product called App Engine. ?The description from > their site follows, > > "Google App Engine enables you to build and host web apps on the same > systems that power Google applications. > App Engine offers fast development and deployment; simple > administration, with no need to worry about hardware, > patches or backups; and effortless scalability. " > > You can deploy applications written in either Python or JAVA. ?There > are free and paid versions of the service. > > The Google App Engine would appear to be a powerful source of CPU > cycles for scientific computing. Not really. It is not intended for such purposes. It is intended for the easy deployment and horizontal scaling of web applications. Each individual request is very short; it is limited to 10 seconds of CPU time. While numpy would be useful for scientific web applications (not least because it would help you keep to that 10 second limit when doing things like simple image processing or summary statistics or whatever), it is not a source of CPU cycles. Services like Amazon EC2 or Rackspace Cloud are much closer to what you want. PiCloud provides an even nicer interface for you: http://www.picloud.com/ Disclosure: Enthought partners with PiCloud to provide most EPD libraries. I can't say I'm disinterested in promoting it, but it *is* a really powerful product that *does* provide CPU cycles for scientific computing with an interface much more suited to it than GAE. >?Unfortunately this is currently not > the case because numpy is not one of the supported libraries. ?The > Python App Engine allows only the installation of user supplied pure > Python code. > > I have recently returned from attending the Google I/O conference in > San Francisco. ?While there I inquired into the possibility of getting > numpy added. ?The basic response was that there doesn't appear to be > much interest from the community given the amount of work it would > take to vet and add numpy. > > I would like to ask your help in changing this perception. > > The quickest and easiest thing you can do would be to add your "me > too" to this feature request (item #190) on the support site: > > http://code.google.com/p/googleappengine/issues/detail?id=190 My understanding is that they hate "me too" comments. They ask that you star the issue instead. 
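Returning to the memmap pickling thread above for a moment: a minimal way to see what Gael and Brent are talking about is to round-trip a memmap through pickle. This is only a sketch -- the file name is made up, and what comes back depends on the numpy version, which is exactly what the ticket is about.

import pickle
import numpy as np

# Write a small memmap to disk (file name is made up for the example).
mm = np.memmap('pickle_test.dat', dtype='float64', mode='w+', shape=(5,))
mm[:] = np.arange(5)

# Round-trip it through pickle and inspect what comes back.
restored = pickle.loads(pickle.dumps(mm))
print(type(restored))
print(np.asarray(restored))

Depending on the numpy version this can raise outright (the tracebacks Gael mentions) or hand back an object that is no longer backed by the file; ticket #1452 linked above has the details.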
-- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From dagss at student.matnat.uio.no Wed May 26 12:49:07 2010 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Wed, 26 May 2010 18:49:07 +0200 Subject: [Numpy-discussion] numpy and the Google App Engine In-Reply-To: References: Message-ID: <4BFD5103.6030005@student.matnat.uio.no> Christopher Hanley wrote: > Greetings, > > Google provides a product called App Engine. The description from > their site follows, > > "Google App Engine enables you to build and host web apps on the same > systems that power Google applications. > App Engine offers fast development and deployment; simple > administration, with no need to worry about hardware, > patches or backups; and effortless scalability. " > > You can deploy applications written in either Python or JAVA. There > are free and paid versions of the service. > > The Google App Engine would appear to be a powerful source of CPU > cycles for scientific computing. Unfortunately this is currently not > the case because numpy is not one of the supported libraries. The > Python App Engine allows only the installation of user supplied pure > Python code. > > I have recently returned from attending the Google I/O conference in > San Francisco. While there I inquired into the possibility of getting > numpy added. The basic response was that there doesn't appear to be > much interest from the community given the amount of work it would > take to vet and add numpy. Something to keep in mind: It's rather trivial to write code to intentionally crash the Python interpreter using pure Python code and NumPy (or overwrite data in it, run custom assembly code...in short, NumPy is a big gaping security hole in this context). This obviously can't go on in the AppEngine. So this probably involves a considerable amount of work in the NumPy source code base as well, it's not simply about verifying. -- Dag Sverre From chanley at stsci.edu Wed May 26 13:32:50 2010 From: chanley at stsci.edu (Christopher Hanley) Date: Wed, 26 May 2010 13:32:50 -0400 Subject: [Numpy-discussion] [SciPy-Dev] numpy and the Google App Engine In-Reply-To: References: Message-ID: On Wed, May 26, 2010 at 12:54 PM, Robert Kern wrote: > On Wed, May 26, 2010 at 12:19, Christopher Hanley wrote: >> Greetings, >> >> Google provides a product called App Engine. ?The description from >> their site follows, >> >> "Google App Engine enables you to build and host web apps on the same >> systems that power Google applications. >> App Engine offers fast development and deployment; simple >> administration, with no need to worry about hardware, >> patches or backups; and effortless scalability. " >> >> You can deploy applications written in either Python or JAVA. ?There >> are free and paid versions of the service. >> >> The Google App Engine would appear to be a powerful source of CPU >> cycles for scientific computing. > > Not really. It is not intended for such purposes. It is intended for > the easy deployment and horizontal scaling of web applications. Each > individual request is very short; it is limited to 10 seconds of CPU > time. While numpy would be useful for scientific web applications (not > least because it would help you keep to that 10 second limit when > doing things like simple image processing or summary statistics or > whatever), it is not a source of CPU cycles. 
Services like Amazon EC2 > or Rackspace Cloud are much closer to what you want. PiCloud provides > an even nicer interface for you: > > ?http://www.picloud.com/ In my conversations with the developers they indicated that it could be used for both. However, either use case would be useful for scientific computing. > > Disclosure: Enthought partners with PiCloud to provide most EPD > libraries. I can't say I'm disinterested in promoting it, but it *is* > a really powerful product that *does* provide CPU cycles for > scientific computing with an interface much more suited to it than > GAE. > >>?Unfortunately this is currently not >> the case because numpy is not one of the supported libraries. ?The >> Python App Engine allows only the installation of user supplied pure >> Python code. >> >> I have recently returned from attending the Google I/O conference in >> San Francisco. ?While there I inquired into the possibility of getting >> numpy added. ?The basic response was that there doesn't appear to be >> much interest from the community given the amount of work it would >> take to vet and add numpy. >> >> I would like to ask your help in changing this perception. >> >> The quickest and easiest thing you can do would be to add your "me >> too" to this feature request (item #190) on the support site: >> >> http://code.google.com/p/googleappengine/issues/detail?id=190 > > My understanding is that they hate "me too" comments. They ask that > you star the issue instead. > I would be happy to see any support either starring or "me too" comments. Their comments to me was that they saw no interest. In my opinion any indication of interest would be a positive. > -- > Robert Kern > > "I have come to believe that the whole world is an enigma, a harmless > enigma that is made terrible by our own mad attempt to interpret it as > though it had an underlying truth." > ?-- Umberto Eco > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From chanley at stsci.edu Wed May 26 13:37:07 2010 From: chanley at stsci.edu (Christopher Hanley) Date: Wed, 26 May 2010 13:37:07 -0400 Subject: [Numpy-discussion] numpy and the Google App Engine In-Reply-To: <4BFD5103.6030005@student.matnat.uio.no> References: <4BFD5103.6030005@student.matnat.uio.no> Message-ID: On Wed, May 26, 2010 at 12:49 PM, Dag Sverre Seljebotn wrote: > Christopher Hanley wrote: >> Greetings, >> >> Google provides a product called App Engine. ?The description from >> their site follows, >> >> "Google App Engine enables you to build and host web apps on the same >> systems that power Google applications. >> App Engine offers fast development and deployment; simple >> administration, with no need to worry about hardware, >> patches or backups; and effortless scalability. " >> >> You can deploy applications written in either Python or JAVA. ?There >> are free and paid versions of the service. >> >> The Google App Engine would appear to be a powerful source of CPU >> cycles for scientific computing. ?Unfortunately this is currently not >> the case because numpy is not one of the supported libraries. ?The >> Python App Engine allows only the installation of user supplied pure >> Python code. >> >> I have recently returned from attending the Google I/O conference in >> San Francisco. ?While there I inquired into the possibility of getting >> numpy added. 
?The basic response was that there doesn't appear to be >> much interest from the community given the amount of work it would >> take to vet and add numpy. > > Something to keep in mind: It's rather trivial to write code to > intentionally crash the Python interpreter using pure Python code and > NumPy (or overwrite data in it, run custom assembly code...in short, > NumPy is a big gaping security hole in this context). This obviously > can't go on in the AppEngine. So this probably involves a considerable > amount of work in the NumPy source code base as well, it's not simply > about verifying. > Agreed. Perhaps the recently discussed rework of the C internals will better allow a security audit of numpy. At that point perhaps the numpy community could more easily work with Google to fix security problems. > -- > Dag Sverre > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -- Christopher Hanley Senior Systems Software Engineer Space Telescope Science Institute 3700 San Martin Drive Baltimore MD, 21218 (410) 338-4338 From d.l.goldsmith at gmail.com Wed May 26 13:57:58 2010 From: d.l.goldsmith at gmail.com (David Goldsmith) Date: Wed, 26 May 2010 10:57:58 -0700 Subject: [Numpy-discussion] numpy and the Google App Engine In-Reply-To: References: <4BFD5103.6030005@student.matnat.uio.no> Message-ID: On Wed, May 26, 2010 at 10:37 AM, Christopher Hanley wrote: > On Wed, May 26, 2010 at 12:49 PM, Dag Sverre Seljebotn > wrote: > > Christopher Hanley wrote: > >> Greetings, > >> > >> Google provides a product called App Engine. The description from > >> their site follows, > >> > >> "Google App Engine enables you to build and host web apps on the same > >> systems that power Google applications. > >> App Engine offers fast development and deployment; simple > >> administration, with no need to worry about hardware, > >> patches or backups; and effortless scalability. " > >> > >> You can deploy applications written in either Python or JAVA. There > >> are free and paid versions of the service. > >> > >> The Google App Engine would appear to be a powerful source of CPU > >> cycles for scientific computing. Unfortunately this is currently not > >> the case because numpy is not one of the supported libraries. The > >> Python App Engine allows only the installation of user supplied pure > >> Python code. > >> > >> I have recently returned from attending the Google I/O conference in > >> San Francisco. While there I inquired into the possibility of getting > >> numpy added. The basic response was that there doesn't appear to be > >> much interest from the community given the amount of work it would > >> take to vet and add numpy. > > > > Something to keep in mind: It's rather trivial to write code to > > intentionally crash the Python interpreter using pure Python code and > > NumPy (or overwrite data in it, run custom assembly code...in short, > > NumPy is a big gaping security hole in this context). This obviously > > can't go on in the AppEngine. So this probably involves a considerable > > amount of work in the NumPy source code base as well, it's not simply > > about verifying. > > > > Agreed. Perhaps the recently discussed rework of the C internals will > better allow a security audit of numpy. My guess is that when "the fur begins to fly," submitted tickets will receive more attention, i.e., if you really want to see this done...file a ticket. 
(IMO, it's *never* wasted effort to do this: the worst that can happen is that some - recorded - person will close it as "will not do," and if for some unforeseeable reason they're unwilling to include an explanation as to why, well, you'll "know where they live," so to speak.) DG > At that point perhaps the > numpy community could more easily work with Google to fix security > problems. > > > > -- > > Dag Sverre > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > > > -- > Christopher Hanley > Senior Systems Software Engineer > Space Telescope Science Institute > 3700 San Martin Drive > Baltimore MD, 21218 > (410) 338-4338 > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -- Mathematician: noun, someone who disavows certainty when their uncertainty set is non-empty, even if that set has measure zero. Hope: noun, that delusive spirit which escaped Pandora's jar and, with her lies, prevents mankind from committing a general suicide. (As interpreted by Robert Graves) -------------- next part -------------- An HTML attachment was scrubbed... URL: From millman at berkeley.edu Wed May 26 15:03:57 2010 From: millman at berkeley.edu (Jarrod Millman) Date: Wed, 26 May 2010 12:03:57 -0700 Subject: [Numpy-discussion] Introduction to Scott, Jason, and (possibly) others from Enthought In-Reply-To: References: <2D706A49-3574-4E5D-81CD-B89B5E6BD0B5@enthought.com> <11E425D5-D34A-44FC-A0B1-6D2641D522EA@enthought.com> Message-ID: 2010/5/25 St?fan van der Walt : > Awesome! Since github now supports SVN interaction, and all the core > devs use Git, now might be a good time to move the entire numpy source > tree? ?It will certainly make it easier to merge the refactor changes! I would love to move numpy to github as well. Almost everything I work on is there now and I am really enjoying using git and the github infrastructure is really nice. This is obviously a separate issue and one that shouldn't deflect the discussion on the proposed refactoring. But given how many of the developers are using git-svn and that you can use an svn client with github, it might be worth having a quick discussion about this in the near future. For instance, I wonder how many of the developer's prefer using git at this point. Also it would be interesting to hear from any of the developer's who would be opposed to git. A few year's ago this was a hot topic for discussion, but it may be that this isn't very controversial at this point. Jarrod From charlesr.harris at gmail.com Wed May 26 15:54:07 2010 From: charlesr.harris at gmail.com (Charles R Harris) Date: Wed, 26 May 2010 13:54:07 -0600 Subject: [Numpy-discussion] Introduction to Scott, Jason, and (possibly) others from Enthought In-Reply-To: References: <2D706A49-3574-4E5D-81CD-B89B5E6BD0B5@enthought.com> <11E425D5-D34A-44FC-A0B1-6D2641D522EA@enthought.com> Message-ID: On Wed, May 26, 2010 at 1:03 PM, Jarrod Millman wrote: > 2010/5/25 St?fan van der Walt : > > Awesome! Since github now supports SVN interaction, and all the core > > devs use Git, now might be a good time to move the entire numpy source > > tree? It will certainly make it easier to merge the refactor changes! > > I would love to move numpy to github as well. 
Almost everything I > work on is there now and I am really enjoying using git and the github > infrastructure is really nice. This is obviously a separate issue and > one that shouldn't deflect the discussion on the proposed refactoring. > But given how many of the developers are using git-svn and that you > can use an svn client with github, it might be worth having a quick > discussion about this in the near future. For instance, I wonder how > many of the developer's prefer using git at this point. Also it would > be interesting to hear from any of the developer's who would be > opposed to git. A few year's ago this was a hot topic for discussion, > but it may be that this isn't very controversial at this point. > > I think the main problem has been windows compatibility. Git is best from the command line whereas the windows command line is an afterthought. Another box that needs a check-mark is the buildbot. If svn clients are supported then it may be that neither of those are going to be a problem. However, It needs user testing. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From david.huard at gmail.com Wed May 26 16:10:40 2010 From: david.huard at gmail.com (David Huard) Date: Wed, 26 May 2010 16:10:40 -0400 Subject: [Numpy-discussion] Bug in frompyfunc starting at 10000 elements? In-Reply-To: References: Message-ID: And in 2.0.0.dev8437. More hints: Assume has shape (N, Da) and b has shape (N, Db) * There is a problem wben N >= 10000, Db=1 and Da > 1. * There is no problem when N >= 10000, Da=1 and Db > 1. * The first row is OK, but for all others, there is one error per row, appearing in first column, then last column, first, etc. Happy debugging ! David H. On Fri, May 21, 2010 at 9:22 PM, David Warde-Farley wrote: > Confirmed in NumPy 1.4.1, Py 2.6.5. > > David > > On Fri, 21 May 2010, James Bergstra wrote: > >> Hi all, I'm wondering if this is a bug... >> >> Something strange happens with my ufunc as soon as I use 10000 elements. As >> the test shows, the ufunc computes the correct result for either the first >> or last 9999 elements, but both at the same time is no good. >> >> Turns out I'm only running numpy 1.3.0 with Python 2.6.4... could someone >> with a more recent installation maybe check to see if this has been fixed? >> >> Thanks, >> >> def test_ufunc(): >> ? ?np = numpy >> >> ? ?rng = np.random.RandomState(2342) >> ? ?a = rng.randn(10000, 2) >> ? ?b = rng.randn(10000, 1) >> >> >> ? ?f = lambda x,y:x*y >> ? ?ufunc = np.frompyfunc(lambda *x:numpy.prod(x), 2, 1) >> >> ? ?def g(x,y): >> ? ? ? ?return np.asarray(ufunc(x,y), dtype='float64') >> >> >> ? ?assert numpy.allclose(f(a[:-1],b[:-1]), g(a[:-1],b[:-1])) >> ? # PASS >> ? ?assert numpy.allclose(f(a[1:],b[1:]), g(a[1:],b[1:])) ? ? ? ? ?# PASS >> ? ?assert numpy.allclose(f(a,b), g(a,b)) ? ? ? ? ? ? ? ? ? ? ? ? ? ? 
# FAIL >> >> >> -- >> http://www-etud.iro.umontreal.ca/~bergstrj >> > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From charlesr.harris at gmail.com Wed May 26 16:27:01 2010 From: charlesr.harris at gmail.com (Charles R Harris) Date: Wed, 26 May 2010 14:27:01 -0600 Subject: [Numpy-discussion] Introduction to Scott, Jason, and (possibly) others from Enthought In-Reply-To: References: <2D706A49-3574-4E5D-81CD-B89B5E6BD0B5@enthought.com> <11E425D5-D34A-44FC-A0B1-6D2641D522EA@enthought.com> Message-ID: On Wed, May 26, 2010 at 1:54 PM, Charles R Harris wrote: > > > On Wed, May 26, 2010 at 1:03 PM, Jarrod Millman wrote: > >> 2010/5/25 St?fan van der Walt : >> > Awesome! Since github now supports SVN interaction, and all the core >> > devs use Git, now might be a good time to move the entire numpy source >> > tree? It will certainly make it easier to merge the refactor changes! >> >> I would love to move numpy to github as well. Almost everything I >> work on is there now and I am really enjoying using git and the github >> infrastructure is really nice. This is obviously a separate issue and >> one that shouldn't deflect the discussion on the proposed refactoring. >> But given how many of the developers are using git-svn and that you >> can use an svn client with github, it might be worth having a quick >> discussion about this in the near future. For instance, I wonder how >> many of the developer's prefer using git at this point. Also it would >> be interesting to hear from any of the developer's who would be >> opposed to git. A few year's ago this was a hot topic for discussion, >> but it may be that this isn't very controversial at this point. >> >> > I think the main problem has been windows compatibility. Git is best from > the command line whereas the windows command line is an afterthought. > Another box that needs a check-mark is the buildbot. If svn clients are > supported then it may be that neither of those are going to be a problem. > However, It needs user testing. > > A newish git windows client I hadn't heard of before is gitextensions . Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From cournape at gmail.com Wed May 26 18:47:11 2010 From: cournape at gmail.com (David Cournapeau) Date: Thu, 27 May 2010 07:47:11 +0900 Subject: [Numpy-discussion] Introduction to Scott, Jason, and (possibly) others from Enthought In-Reply-To: References: <2D706A49-3574-4E5D-81CD-B89B5E6BD0B5@enthought.com> <11E425D5-D34A-44FC-A0B1-6D2641D522EA@enthought.com> Message-ID: On Thu, May 27, 2010 at 4:54 AM, Charles R Harris wrote: > > > On Wed, May 26, 2010 at 1:03 PM, Jarrod Millman > wrote: >> >> 2010/5/25 St?fan van der Walt : >> > Awesome! Since github now supports SVN interaction, and all the core >> > devs use Git, now might be a good time to move the entire numpy source >> > tree? ?It will certainly make it easier to merge the refactor changes! >> >> I would love to move numpy to github as well. ?Almost everything I >> work on is there now and I am really enjoying using git and the github >> infrastructure is really nice. ?This is obviously a separate issue and >> one that shouldn't deflect the discussion on the proposed refactoring. >> ?But given how many of the developers are using git-svn and that you >> can use an svn client with github, it might be worth having a quick >> discussion about this in the near future. 
?For instance, I wonder how >> many of the developer's prefer using git at this point. ?Also it would >> be interesting to hear from any of the developer's who would be >> opposed to git. ?A few year's ago this was a hot topic for discussion, >> but it may be that this isn't very controversial at this point. >> > > I think the main problem has been windows compatibility. Git is best from > the command line whereas the windows command line is an afterthought. > Another box that needs a check-mark is the buildbot. If svn clients are > supported then it may be that neither of those are going to be a problem. > However, It needs user testing. As I mentioned in a previous post, there is smartgit, which is free for personal use, and is a graphical UI (does *not* depend on the mingw port of git, uses the reimplementation of git jgit in java used in google for android I believe). gitextensions is just a GUI around the mingw tools, and as such is less reliable. github also supports smart http for people behind proxies (although I don't know about the authentification issues if any). Trac and buildbot could use a svn mirror as provided by github for the time being, although there seems to be an issue with the numpy repo ATM (maybe my fault: http://support.github.com/discussions/repos/3155-svn-checkout-error-200-ok-error) David From millman at berkeley.edu Wed May 26 18:47:42 2010 From: millman at berkeley.edu (Jarrod Millman) Date: Wed, 26 May 2010 15:47:42 -0700 Subject: [Numpy-discussion] curious about how people would feel about moving to github Message-ID: Hello, I changed the subject line for this thread, since I didn't want to hijack another thread. Anyway, I am not proposing that we actually decide whether to move to git and github now, but I am just curious how people would feel. We had a conversation about this a few years ago and it was quite contentious at the time. Since then, I believe a number of us have started using git and github for most of our work. And there are a number of developers using git-svn to develop numpy now. So I was curious to get a feeling for what people would think about it, if we moved to git. (I don't want to rehash the arguments for the move.) Anyway, Chuck listed the main concerns we had previously when we discussed moving from svn to git. See the discussion below. Are there any other concerns? Am I right in thinking that most of the developers would prefer git at this point? Or are there still a number of developers who would prefer using svn still? On Wed, May 26, 2010 at 12:54 PM, Charles R Harris wrote: > I think the main problem has been windows compatibility. Git is best from > the command line whereas the windows command line is an afterthought. > Another box that needs a check-mark is the buildbot. If svn clients are > supported then it may be that neither of those are going to be a problem. I was under the impression that there were a number of decent git clients for Windows now, but I don't know anyone who develops on Windows. Are there any NumPy developers who use Windows who could check out the current situation? Pulling from github with an svn client works very well, so buildbot could continue working as is: http://github.com/blog/626-announcing-svn-support And if it turns out the Windows clients are still not good enough, we could look into the recently add svn write support to github: http://github.com/blog/644-subversion-write-support No need for us to make any changes immediately. I am just curious how people would feel about it at this point. 
Jarrod From oliphant at enthought.com Wed May 26 19:00:18 2010 From: oliphant at enthought.com (Travis Oliphant) Date: Wed, 26 May 2010 18:00:18 -0500 Subject: [Numpy-discussion] Introduction to Scott, Jason, and (possibly) others from Enthought In-Reply-To: References: <2D706A49-3574-4E5D-81CD-B89B5E6BD0B5@enthought.com> <11E425D5-D34A-44FC-A0B1-6D2641D522EA@enthought.com> Message-ID: <4135A273-9DE4-4CBB-AE88-DCED6D5E82E5@enthought.com> On May 26, 2010, at 3:50 AM, Sebastian Walter wrote: > I'm a potential user of the C-API and therefore I'm very interested in > the outcome. > In the previous discussion > (http://comments.gmane.org/gmane.comp.python.numeric.general/37409) > many different views on what the new C-API "should" be were expressed. > > Naturally, I wonder if the new C-API will be useful for my purposes. > So, I'm not so excited about a "refactoring log" where only the > progress is reported. I fear that some (potentially minor) design > decisions would render the new C-API useless for me. > > So my question is: > Does this "refactoring log" also include something like a Numpy > Enhancement Proposal? Something that can be discussed beforehand? > I.e., will there be a detailed description (i.e. code examples) what > the goal of the refactoring is? > > If there is any interest, I could provide some simple test examples in > C that would explain what I'd like to be able to do with the new > C-API. Our team would love to see this if possible. -Travis > > > Sebastian > > > > > On Wed, May 26, 2010 at 8:21 AM, David Goldsmith > wrote: >> On Tue, May 25, 2010 at 9:22 PM, Travis Oliphant > > >> wrote: >>> >>> On May 25, 2010, at 4:49 PM, David Goldsmith wrote: >>> >>> Travis: do you already have a place on the NumPy Development Wiki >>> where >>> you're (b)logging your design decisions? Seems like a good way for >>> concerned parties to monitor your choices in more or less real >>> time and thus >>> provide comment in a timely fashion. >>> >>> This is a great idea of course and we will definitely post progess >>> there. >>> >> >> Thanks; specific URL please, when available; plus, prominently >> feature (a >> link to) the location on the Development Wiki home page, at the >> very least >> (i.e., if not also on the NumPy home page). >> >>> >>> So far, the code has been reviewed, >> >> I.e., the existing code, yes? >> >>> >>> and several functions identified for re-factoring. >> >> Please enumerate on the "Wiki Refactoring Log" (name tentative - I >> don't >> care what we call it, just so long as it exists, its purpose is >> clear, and >> we all know where it is). >> >>> This is taking place in a github branch of numpy called numpy >>> refactor. >> >> "This" = the actual creation/modification of code, yes? >> >> DG >>> >>> -Travis >>> >>> DG >>> >>> On Tue, May 25, 2010 at 2:19 PM, Charles R Harris >>> wrote: >>>> >>>> >>>> On Tue, May 25, 2010 at 2:54 PM, Travis Oliphant >>> > >>>> wrote: >>>>> >>>>> On May 25, 2010, at 2:50 PM, Charles R Harris wrote: >>>>> >>>>> >>>>> On Tue, May 25, 2010 at 1:37 PM, Travis Oliphant >>>>> wrote: >>>>>> >>>>>> Hi everyone, >>>>>> >>>>>> There has been some talk about re-factoring NumPy to separate >>>>>> out the >>>>>> Python C-API layer and make NumPy closer to a C-library. I know >>>>>> there are a few different ideas about what this means, and also >>>>>> that >>>>>> people are very busy. I also know there is a NumPy 2.0 release >>>>>> that >>>>>> is in the works. 
>>>>>> >>>>>> I'm excited to let everyone know that we (at Enthought) have >>>>>> been able >>>>>> to find resources (about 3 man months) to work on this re- >>>>>> factoring >>>>>> project and Scott and Jason (both very experienced C and Python >>>>>> programmers) are actively pursuing it. My hope is that NumPy >>>>>> 2.0 >>>>>> will contain this re-factoring (which should be finished just >>>>>> after >>>>>> SciPy 2010 --- where I'm going to organize a Sprint on NumPy >>>>>> which >>>>>> will include at least date-time improvements and re-factoring >>>>>> work). >>>>>> >>>>>> While we have specific goals for the re-factoring, we want this >>>>>> activity to be fully integrated with the NumPy community and >>>>>> Scott and >>>>>> Jason want to interact with the community as much as feasible >>>>>> as they >>>>>> suggest re-factoring changes (though they both have more >>>>>> experience >>>>>> with phone-conversations to resolve concerns than email chains >>>>>> and so >>>>>> some patience from everybody will be appreciated). >>>>>> >>>>>> Because Jason and Scott are new to this mailing list (but not >>>>>> new to >>>>>> NumPy), I wanted to introduce them so they would feel more >>>>>> comfortable posting questions and people would have some >>>>>> context as to >>>>>> what they were trying to do. >>>>>> >>>>>> Scott and Jason are both very proficient and skilled >>>>>> programmers and I >>>>>> have full confidence in their abilities. That said, we very >>>>>> much >>>>>> want the input of as many people as possible as we pursue the >>>>>> goal of >>>>>> grouping together more tightly the Python C-API interface layer >>>>>> to >>>>>> NumPy. >>>>>> >>>>>> I will be involved in some of the discussions, but am currently >>>>>> on a >>>>>> different project which has tight schedules and so I will only >>>>>> be able >>>>>> to provide limited "mailing-list" visibility. >>>>>> >>>>> >>>>> I think 2.0 would be a bit early for this. Is there any reason it >>>>> couldn't be done in 2.1? What is the planned policy with regards >>>>> to the >>>>> visible interface for extensions? It would also be nice to have >>>>> a rough idea >>>>> of how the resulting code would be layered, i.e., what is the >>>>> design for >>>>> this re-factoring. Simply having a design would be a major step >>>>> forward. >>>>> >>>>> The problem with doing it in 2.1 is that this re-factoring will >>>>> require >>>>> extensions to be re-built. The visible interface to extensions >>>>> will not >>>>> change, but there will likely be ABI incompatibility. It >>>>> seems prudent to >>>>> do this in NumPy 2.0. Perhaps we can also put in place the ABI- >>>>> protecting >>>>> indirection approaches that David C. was suggesting earlier. >>>>> Some aspects of the design are still being fleshed out, but the >>>>> basic >>>>> idea is to separate out a core library that is as independent of >>>>> the Python >>>>> C-API as possible. There will likely be at least some >>>>> dependency on the >>>>> Python C-API (reference counting and error handling and possibly >>>>> others) >>>>> which any interface would have to provide in a very simple >>>>> Python.h -- >>>>> equivalent, for example. >>>>> Our purpose is to allow NumPy to be integrated with other >>>>> languages or >>>>> other frameworks systems without explicitly relying on >>>>> CPython. There are >>>>> a lot of questions as to how this will work, and so much of that >>>>> is being >>>>> worked out. 
Part of the reason for this mail is to help ensure >>>>> that as >>>>> much of this discussion as possible takes place in public. >>>> >>>> Sounds good, but what if it doesn't get finished in a few months? >>>> I think >>>> we should get 2.0.0 out pronto, ideally it would already have >>>> been released. >>>> I think a major refactoring like this proposal should get the >>>> 3.0.0 label. >>>> Admittedly that makes keeping a refactored branch current with >>>> fixes going >>>> into the trunk a hassle, but perhaps that can be worked around >>>> somewhat by >>>> clearly labeling what files will be touched in the refactoring >>>> and possibly >>>> rearranging the content of the existing files. This requires a >>>> game plan and >>>> a clear idea of the goal. Put simply, I think the proposed >>>> schedule is too >>>> ambitious and needs to be fleshed out. This refactoring isn't >>>> going to be >>>> as straight forward as the python3k port because a lot of design >>>> decisions >>>> need to be made along the way. >>>> >>>> Chuck >>>> _______________________________________________ >>>> NumPy-Discussion mailing list >>>> NumPy-Discussion at scipy.org >>>> http://mail.scipy.org/mailman/listinfo/numpy-discussion >>>> >>> >>> >>> >>> -- >>> Mathematician: noun, someone who disavows certainty when their >>> uncertainty >>> set is non-empty, even if that set has measure zero. >>> >>> Hope: noun, that delusive spirit which escaped Pandora's jar and, >>> with her >>> lies, prevents mankind from committing a general suicide. (As >>> interpreted >>> by Robert Graves) >>> _______________________________________________ >>> NumPy-Discussion mailing list >>> NumPy-Discussion at scipy.org >>> http://mail.scipy.org/mailman/listinfo/numpy-discussion >>> >>> --- >>> Travis Oliphant >>> Enthought, Inc. >>> oliphant at enthought.com >>> 1-512-536-1057 >>> http://www.enthought.com >>> >>> >>> >>> _______________________________________________ >>> NumPy-Discussion mailing list >>> NumPy-Discussion at scipy.org >>> http://mail.scipy.org/mailman/listinfo/numpy-discussion >>> >> >> >> >> -- >> Mathematician: noun, someone who disavows certainty when their >> uncertainty >> set is non-empty, even if that set has measure zero. >> >> Hope: noun, that delusive spirit which escaped Pandora's jar and, >> with her >> lies, prevents mankind from committing a general suicide. (As >> interpreted >> by Robert Graves) >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion -- Travis Oliphant Enthought Inc. 1-512-536-1057 http://www.enthought.com oliphant at enthought.com From oliphant at enthought.com Wed May 26 19:05:26 2010 From: oliphant at enthought.com (Travis Oliphant) Date: Wed, 26 May 2010 18:05:26 -0500 Subject: [Numpy-discussion] Introduction to Scott, Jason, and (possibly) others from Enthought In-Reply-To: References: <2D706A49-3574-4E5D-81CD-B89B5E6BD0B5@enthought.com> <11E425D5-D34A-44FC-A0B1-6D2641D522EA@enthought.com> Message-ID: <2823234B-7058-48DF-A18D-1B544D352958@enthought.com> On May 26, 2010, at 5:31 AM, Pauli Virtanen wrote: > Wed, 26 May 2010 10:50:19 +0200, Sebastian Walter wrote: >> I'm a potential user of the C-API and therefore I'm very interested >> in >> the outcome. 
>> In the previous discussion >> (http://comments.gmane.org/gmane.comp.python.numeric.general/37409) >> many >> different views on what the new C-API "should" be were expressed. > > I believe the aim of the refactor is to *not* change the C-API at all, > but separate it internally from the routines that do the heavy > lifting. > Externally, Numpy would still look the same, but be more easy to > maintain. > > The routines that do the heavy lifting could then be good for reuse > and > be more easy to maintain, but I think how and where they would be > exposed > hasn't been discussed so far... > This is correct. In our plans, the current NumPy C-API would not change. Clearly there will be another potential API that could be used by other systems, but exactly what this is should be discussed and decided upon over time. I don't think our goals for the re-factoring depends on how this is done exactly. -Travis -------------- next part -------------- An HTML attachment was scrubbed... URL: From oliphant at enthought.com Wed May 26 19:10:23 2010 From: oliphant at enthought.com (Travis Oliphant) Date: Wed, 26 May 2010 18:10:23 -0500 Subject: [Numpy-discussion] Introduction to Scott, Jason, and (possibly) others from Enthought In-Reply-To: References: <2D706A49-3574-4E5D-81CD-B89B5E6BD0B5@enthought.com> <11E425D5-D34A-44FC-A0B1-6D2641D522EA@enthought.com> Message-ID: <485E20B5-C7F7-4939-AEFA-28F6B8BB0502@enthought.com> On May 26, 2010, at 6:40 AM, Sebastian Walter wrote: > On Wed, May 26, 2010 at 12:31 PM, Pauli Virtanen wrote: >> Wed, 26 May 2010 10:50:19 +0200, Sebastian Walter wrote: >>> I'm a potential user of the C-API and therefore I'm very >>> interested in >>> the outcome. >>> In the previous discussion >>> (http://comments.gmane.org/gmane.comp.python.numeric.general/ >>> 37409) many >>> different views on what the new C-API "should" be were expressed. >> >> I believe the aim of the refactor is to *not* change the C-API at >> all, >> but separate it internally from the routines that do the heavy >> lifting. >> Externally, Numpy would still look the same, but be more easy to >> maintain. > > Sorry for the confusion. By C-API I meant a C-API that would be > independent of the CPython API. > >> >> The routines that do the heavy lifting could then be good for reuse >> and >> be more easy to maintain, but I think how and where they would be >> exposed >> hasn't been discussed so far... > > I had the impression that the goal is not only to have code that is > easier to maintain but to give developers the possibility to use numpy > functionality (broadcasting, ufuncs, ...) within C code without having > to use CPython API (refcounts, construction of PyObjects etc.). This is partially correct. There may be a need to have some "stub" implementation of some aspects of the Python C-API (probably at least reference counting and exception handling for now). We don't need to work all of this out initially. I think getting the separation done in the next several weeks will spawn conversations and ideas that may take several months to work out the new C-level-only API. That "interface" API is important in the short term, but also one that could change over the next several months. 
-Travis From oliphant at enthought.com Wed May 26 19:12:21 2010 From: oliphant at enthought.com (Travis Oliphant) Date: Wed, 26 May 2010 18:12:21 -0500 Subject: [Numpy-discussion] curious about how people would feel about moving to github In-Reply-To: References: Message-ID: On May 26, 2010, at 5:47 PM, Jarrod Millman wrote: > Hello, > > I changed the subject line for this thread, since I didn't want to > hijack another thread. Anyway, I am not proposing that we actually > decide whether to move to git and github now, but I am just curious > how people would feel. We had a conversation about this a few years > ago and it was quite contentious at the time. Since then, I believe a > number of us have started using git and github for most of our work. > And there are a number of developers using git-svn to develop numpy > now. So I was curious to get a feeling for what people would think > about it, if we moved to git. (I don't want to rehash the arguments > for the move.) > I think we are ready for such a move. Someone should think about the implications, though (with Trac integration, check-in mailings, etc.) and make sure we get something we all like. Somebody probably has thought through all of these things already, though. -Travis -------------- next part -------------- An HTML attachment was scrubbed... URL: From millman at berkeley.edu Wed May 26 19:20:02 2010 From: millman at berkeley.edu (Jarrod Millman) Date: Wed, 26 May 2010 16:20:02 -0700 Subject: [Numpy-discussion] curious about how people would feel about moving to github In-Reply-To: References: Message-ID: On Wed, May 26, 2010 at 4:12 PM, Travis Oliphant wrote: > I think we are ready for such a move. ? ?Someone should think about the > implications, though (with Trac integration, check-in mailings, etc.) and > make sure we get something we all like. ? Somebody probably has thought > through all of these things already, though. Cool. At this point, I am just testing the water. If enough people seem to be OK with the idea in general, I can spend some time looking into the details more closely. Before we make an actual decision, it would be worth turning this into an actual NEP and then asking people to review it. Jarrod From matthew.brett at gmail.com Wed May 26 19:44:08 2010 From: matthew.brett at gmail.com (Matthew Brett) Date: Wed, 26 May 2010 16:44:08 -0700 Subject: [Numpy-discussion] Introduction to Scott, Jason, and (possibly) others from Enthought In-Reply-To: References: <2D706A49-3574-4E5D-81CD-B89B5E6BD0B5@enthought.com> <11E425D5-D34A-44FC-A0B1-6D2641D522EA@enthought.com> Message-ID: Hi, > I think the main problem has been windows compatibility. Git is best from > the command line whereas the windows command line is an afterthought. > Another box that needs a check-mark is the buildbot. If svn clients are > supported then it may be that neither of those are going to be a problem. > However, It needs user testing. For windows - I think honestly this is now not a serious barrier to using git. I've installed msysgit on 4 or 5 machines recently, and it has been very smooth - as well as providing a nice bash shell. I think it would be a huge reduction in the barrier to contributing to numpy if we could change to git. 
Cheers, Matthew From stefan at sun.ac.za Wed May 26 20:11:39 2010 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Wed, 26 May 2010 17:11:39 -0700 Subject: [Numpy-discussion] curious about how people would feel about moving to github In-Reply-To: References: Message-ID: On 26 May 2010 16:12, Travis Oliphant wrote: > I changed the subject line for this thread, since I didn't want to > hijack another thread. ?Anyway, I am not proposing that we actually > decide whether to move to git and github now, but I am just curious > how people would feel. ?We had a conversation about this a few years > ago and it was quite contentious at the time. ?Since then, I believe a > number of us have started using git and github for most of our work. > And there are a number of developers using git-svn to develop numpy > now. ?So I was curious to get a feeling for what people would think > about it, if we moved to git. ?(I don't want to rehash the arguments > for the move.) > > I think we are ready for such a move. ? ?Someone should think about the > implications, though (with Trac integration, check-in mailings, etc.) and > make sure we get something we all like. ? Somebody probably has thought > through all of these things already, though. Awesome, if there's enough interest I'll help Jarrod out on the NEP. I've been looking at GitHub's Trac integration, and it seems that we should be able to have the same level of integration with the bugtracker as we currently do. Their plugin is available here: http://github.com/davglass/github-trac/ The SVN-checkout functionality should take care of the build bot. As a bonus, we no longer have to administrate user accounts. Converting the SVN repo to Git should pose no problem. Regards St?fan From porterj at alum.rit.edu Wed May 26 21:16:13 2010 From: porterj at alum.rit.edu (James Porter) Date: Wed, 26 May 2010 20:16:13 -0500 Subject: [Numpy-discussion] zeros_like and friends shouldn't use ndarray.__new__(type(a), ...) In-Reply-To: <4BEC8C73.3050106@alum.rit.edu> References: <4BEC8C73.3050106@alum.rit.edu> Message-ID: Ping? On 5/13/2010 6:34 PM, Jim Porter wrote: > Ok, let's try sending this message again, since it looks like I can't > send from gmane... > > (See discussion on python-list at > http://permalink.gmane.org/gmane.comp.python.general/661328 for context) > > numpy.zeros_like contains the following code: > > def zeros_like(a): > if isinstance(a, ndarray): > res = ndarray.__new__(type(a), a.shape, a.dtype, > order=a.flags.fnc) > res.fill(0) > return res > ... > > This is a problem because basetype.__new__(subtype, ...) raises an > exception when subtype is defined from C (specifically, when > Py_TPFLAGS_HEAPTYPE is not set). There's a check in Objects/typeobject.c > in tp_new_wrapper that disallows this (you can grep for "is not safe" to > find there the exception is raised). > > The end result is that it's impossible to use zeros_like, ones_like or > empty_like with ndarray subtypes defined in C. > > While I'm still not sure why Python needs this check in general, Robert > Kern pointed out that the problem can be fixed pretty easily in NumPy by > changing zeros_like and friends to something like this (with some > modifications from me): > > def zeros_like(a): > if isinstance(a, ndarray): > res = numpy.zeros(a.shape, a.dtype, order=a.flags.fnc) > res = res.view(type(a)) > res.__array_finalize__(a) > return res > ... 
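A short illustration of the view-based pattern proposed above. It is only a sketch: numpy.matrix is used here as a stand-in ndarray subclass (being defined in Python, it does not itself trigger the Py_TPFLAGS_HEAPTYPE error), and the helper name zeros_like_preserving_type is invented for the example.

    import numpy as np

    def zeros_like_preserving_type(a):
        # Sketch of the suggested approach: build a plain zeroed ndarray first,
        # then re-tag it as the original subclass instead of calling
        # ndarray.__new__(type(a), ...) directly.
        res = np.zeros(a.shape, a.dtype, order='F' if a.flags.fnc else 'C')
        res = res.view(type(a))        # reinterpret as the original subclass
        res.__array_finalize__(a)      # let the subclass copy its attributes
        return res

    m = np.matrix([[1.0, 2.0], [3.0, 4.0]])
    z = zeros_like_preserving_type(m)
    print(type(z))   # <class 'numpy.matrix'> -- the subclass is preserved
    print(z)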
> > - Jim From josef.pktd at gmail.com Wed May 26 21:25:02 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Wed, 26 May 2010 21:25:02 -0400 Subject: [Numpy-discussion] Introduction to Scott, Jason, and (possibly) others from Enthought In-Reply-To: References: <2D706A49-3574-4E5D-81CD-B89B5E6BD0B5@enthought.com> <11E425D5-D34A-44FC-A0B1-6D2641D522EA@enthought.com> Message-ID: On Wed, May 26, 2010 at 7:44 PM, Matthew Brett wrote: > Hi, > >> I think the main problem has been windows compatibility. Git is best from >> the command line whereas the windows command line is an afterthought. >> Another box that needs a check-mark is the buildbot. If svn clients are >> supported then it may be that neither of those are going to be a problem. >> However, It needs user testing. > > For windows - I think honestly this is now not a serious barrier to > using git. ?I've installed msysgit on 4 or 5 machines recently, and it > has been very smooth - as well as providing a nice bash shell. there is no such thing as a "nice" bash shell for a windows user. I have no idea how to use one. Josef > > I think it would be a huge reduction in the barrier to contributing to > numpy if we could change to git. > > Cheers, > > Matthew > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From matthew.brett at gmail.com Wed May 26 21:34:37 2010 From: matthew.brett at gmail.com (Matthew Brett) Date: Wed, 26 May 2010 18:34:37 -0700 Subject: [Numpy-discussion] Introduction to Scott, Jason, and (possibly) others from Enthought In-Reply-To: References: <2D706A49-3574-4E5D-81CD-B89B5E6BD0B5@enthought.com> <11E425D5-D34A-44FC-A0B1-6D2641D522EA@enthought.com> Message-ID: Hi, > there is no such thing as a "nice" bash shell for a windows user. > I have no idea how to use one. It is a nice bash shell. You may not want a nice bash shell ;) I can't imagine you'd object to one though. It's just a useful place to type git commands, with file / directory path autocompletion, git branch autocompletion and so on. See you, Matthew From charlesr.harris at gmail.com Wed May 26 21:35:28 2010 From: charlesr.harris at gmail.com (Charles R Harris) Date: Wed, 26 May 2010 19:35:28 -0600 Subject: [Numpy-discussion] Introduction to Scott, Jason, and (possibly) others from Enthought In-Reply-To: References: <2D706A49-3574-4E5D-81CD-B89B5E6BD0B5@enthought.com> <11E425D5-D34A-44FC-A0B1-6D2641D522EA@enthought.com> Message-ID: On Wed, May 26, 2010 at 7:25 PM, wrote: > On Wed, May 26, 2010 at 7:44 PM, Matthew Brett > wrote: > > Hi, > > > >> I think the main problem has been windows compatibility. Git is best > from > >> the command line whereas the windows command line is an afterthought. > >> Another box that needs a check-mark is the buildbot. If svn clients are > >> supported then it may be that neither of those are going to be a > problem. > >> However, It needs user testing. > > > > For windows - I think honestly this is now not a serious barrier to > > using git. I've installed msysgit on 4 or 5 machines recently, and it > > has been very smooth - as well as providing a nice bash shell. > > there is no such thing as a "nice" bash shell for a windows user. > I have no idea how to use one. > > Heh. Can you to try the svn interface to github using your favorite svn ap. I suppose we need to set up a test account there. Is it possible to have a multiple user git account on github, or is it all push and merge? 
Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Wed May 26 21:37:16 2010 From: charlesr.harris at gmail.com (Charles R Harris) Date: Wed, 26 May 2010 19:37:16 -0600 Subject: [Numpy-discussion] Introduction to Scott, Jason, and (possibly) others from Enthought In-Reply-To: References: <2D706A49-3574-4E5D-81CD-B89B5E6BD0B5@enthought.com> <11E425D5-D34A-44FC-A0B1-6D2641D522EA@enthought.com> Message-ID: On Wed, May 26, 2010 at 7:34 PM, Matthew Brett wrote: > Hi, > > > there is no such thing as a "nice" bash shell for a windows user. > > I have no idea how to use one. > > It is a nice bash shell. You may not want a nice bash shell ;) > > I can't imagine you'd object to one though. It's just a useful place > to type git commands, with file / directory path autocompletion, git > branch autocompletion and so on. > > Any shell on windows is a pain, if only because of the spaces in the filenames. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsouthey at gmail.com Wed May 26 21:37:16 2010 From: bsouthey at gmail.com (Bruce Southey) Date: Wed, 26 May 2010 20:37:16 -0500 Subject: [Numpy-discussion] curious about how people would feel about moving to github In-Reply-To: References: Message-ID: 2010/5/26 St?fan van der Walt : > On 26 May 2010 16:12, Travis Oliphant wrote: >> I changed the subject line for this thread, since I didn't want to >> hijack another thread. ?Anyway, I am not proposing that we actually >> decide whether to move to git and github now, but I am just curious >> how people would feel. ?We had a conversation about this a few years >> ago and it was quite contentious at the time. ?Since then, I believe a >> number of us have started using git and github for most of our work. >> And there are a number of developers using git-svn to develop numpy >> now. ?So I was curious to get a feeling for what people would think >> about it, if we moved to git. ?(I don't want to rehash the arguments >> for the move.) >> >> I think we are ready for such a move. ? ?Someone should think about the >> implications, though (with Trac integration, check-in mailings, etc.) and >> make sure we get something we all like. ? Somebody probably has thought >> through all of these things already, though. > > Awesome, if there's enough interest I'll help Jarrod out on the NEP. > I've been looking at GitHub's Trac integration, and it seems that we > should be able to have the same level of integration with the > bugtracker as we currently do. ?Their plugin is available here: > > http://github.com/davglass/github-trac/ > > The SVN-checkout functionality should take care of the build bot. ?As > a bonus, we no longer have to administrate user accounts. ?Converting > the SVN repo to Git should pose no problem. > > Regards > St?fan > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > You are all probably aware of this, but I just wanted it said. I do understand the advantage of being able to pull from someone's Python 3 branch (like scipy) as well as some of the more experimental side like the proposed refactoring. All that I ask is that there is one official place to do 'git clone' or 'git pull' from a single official branch. I do not think that it is good to tell users to pull from different branches especially if these branches have conflicts. 
It also provides a common foundation to troubleshoot problems (of course you don't see it because you don't have that branch...). Yet I do understand that any release candidate can be pulled from any tree (as happens with the Linux kernel) and that this should be more of guide than a fixed rule. Bruce From ben.root at ou.edu Wed May 26 21:47:52 2010 From: ben.root at ou.edu (Benjamin Root) Date: Wed, 26 May 2010 20:47:52 -0500 Subject: [Numpy-discussion] Introduction to Scott, Jason, and (possibly) others from Enthought In-Reply-To: References: <2D706A49-3574-4E5D-81CD-B89B5E6BD0B5@enthought.com> <11E425D5-D34A-44FC-A0B1-6D2641D522EA@enthought.com> Message-ID: On Wed, May 26, 2010 at 8:37 PM, Charles R Harris wrote: > > > On Wed, May 26, 2010 at 7:34 PM, Matthew Brett wrote: > >> Hi, >> >> > there is no such thing as a "nice" bash shell for a windows user. >> > I have no idea how to use one. >> >> It is a nice bash shell. You may not want a nice bash shell ;) >> >> I can't imagine you'd object to one though. It's just a useful place >> to type git commands, with file / directory path autocompletion, git >> branch autocompletion and so on. >> >> > Any shell on windows is a pain, if only because of the spaces in the > filenames. > Why? Can't you just escape the spaces with backslashes... oh, nevermind... Ben Root > > Chuck > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From cournape at gmail.com Wed May 26 21:56:26 2010 From: cournape at gmail.com (David Cournapeau) Date: Thu, 27 May 2010 10:56:26 +0900 Subject: [Numpy-discussion] curious about how people would feel about moving to github In-Reply-To: References: Message-ID: On Thu, May 27, 2010 at 10:37 AM, Bruce Southey wrote: > 2010/5/26 St?fan van der Walt : >> On 26 May 2010 16:12, Travis Oliphant wrote: >>> I changed the subject line for this thread, since I didn't want to >>> hijack another thread. ?Anyway, I am not proposing that we actually >>> decide whether to move to git and github now, but I am just curious >>> how people would feel. ?We had a conversation about this a few years >>> ago and it was quite contentious at the time. ?Since then, I believe a >>> number of us have started using git and github for most of our work. >>> And there are a number of developers using git-svn to develop numpy >>> now. ?So I was curious to get a feeling for what people would think >>> about it, if we moved to git. ?(I don't want to rehash the arguments >>> for the move.) >>> >>> I think we are ready for such a move. ? ?Someone should think about the >>> implications, though (with Trac integration, check-in mailings, etc.) and >>> make sure we get something we all like. ? Somebody probably has thought >>> through all of these things already, though. >> >> Awesome, if there's enough interest I'll help Jarrod out on the NEP. >> I've been looking at GitHub's Trac integration, and it seems that we >> should be able to have the same level of integration with the >> bugtracker as we currently do. ?Their plugin is available here: >> >> http://github.com/davglass/github-trac/ >> >> The SVN-checkout functionality should take care of the build bot. ?As >> a bonus, we no longer have to administrate user accounts. ?Converting >> the SVN repo to Git should pose no problem. 
>> >> Regards >> St?fan >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> > > You are all probably aware of this, but I just wanted it said. I do > understand the advantage of being able to pull from someone's Python 3 > branch (like scipy) as well as some of the more experimental side like > the proposed refactoring. There could (and should) be a github repo on scipy.org. This would be used as the "reference". Something that needs being discussed on is how people will work together - going to "fulltime" git means a change in how to interact compared to git-svn (no more rebase to make changes visible, etc...). I am wondering whether we should follow the pull model - maybe through a gateway, I am not sure: http://www.selenic.com/pipermail/mercurial/2008-July/020116.html > It also provides a common foundation to > troubleshoot problems (of course you don't see it because you don't > have that branch...). Yet I do understand that any release candidate > can be pulled from any tree (as happens with the Linux kernel) and > that this should be more of guide than a fixed rule. The whole point of DVCS is that it is trivial to set up an official repo where the releases are done from, without preventing people to work as they see fit. cheers, David From aarchiba at physics.mcgill.ca Wed May 26 22:19:27 2010 From: aarchiba at physics.mcgill.ca (Anne Archibald) Date: Wed, 26 May 2010 23:19:27 -0300 Subject: [Numpy-discussion] curious about how people would feel about moving to github In-Reply-To: References: Message-ID: Hi Jarrod, I'm in favour of the switch, though I don't use Windows. I find git far more convenient to use than SVN; I've been using git-svn, and in spite of the headaches it's caused me I still prefer it to raw SVN. It seems to me that git's flexibility in how people collaborate means we can do a certain amount of figuring out after the switch. My experience with a small project has been that anyone who wants to make major changes just clones the repository on github and makes the changes; then we email the main author to ask him to pull particular branches into the main repo. It works well enough. Anne On 26 May 2010 19:47, Jarrod Millman wrote: > Hello, > > I changed the subject line for this thread, since I didn't want to > hijack another thread. ?Anyway, I am not proposing that we actually > decide whether to move to git and github now, but I am just curious > how people would feel. ?We had a conversation about this a few years > ago and it was quite contentious at the time. ?Since then, I believe a > number of us have started using git and github for most of our work. > And there are a number of developers using git-svn to develop numpy > now. ?So I was curious to get a feeling for what people would think > about it, if we moved to git. ?(I don't want to rehash the arguments > for the move.) > > Anyway, Chuck listed the main concerns we had previously when we > discussed moving from svn to git. ?See the discussion below. ?Are > there any other concerns? ?Am I right in thinking that most of the > developers would prefer git at this point? ?Or are there still a > number of developers who would prefer using svn still? > > On Wed, May 26, 2010 at 12:54 PM, Charles R Harris > wrote: >> I think the main problem has been windows compatibility. Git is best from >> the command line whereas the windows command line is an afterthought. 
>> Another box that needs a check-mark is the buildbot. If svn clients are >> supported then it may be that neither of those are going to be a problem. > > I was under the impression that there were a number of decent git > clients for Windows now, but I don't know anyone who develops on > Windows. ?Are there any NumPy developers who use Windows who could > check out the current situation? > > Pulling from github with an svn client works very well, so buildbot > could continue working as is: > http://github.com/blog/626-announcing-svn-support > > And if it turns out the Windows clients are still not good enough, we > could look into the recently add svn write support to github: > http://github.com/blog/644-subversion-write-support > > No need for us to make any changes immediately. ?I am just curious how > people would feel about it at this point. > > Jarrod > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From ben.root at ou.edu Wed May 26 22:38:29 2010 From: ben.root at ou.edu (Benjamin Root) Date: Wed, 26 May 2010 21:38:29 -0500 Subject: [Numpy-discussion] curious about how people would feel about moving to github In-Reply-To: References: Message-ID: I wouldn't call myself a developer, but I have been wanting to contribute recently. I learned source control with svn, so I am much more comfortable with it. My one attempt at using git for a personal project ended in failure. Then I discovered this guide, "Git-SVN Crash Course": http://git.or.cz/course/svn.html I hope this would be useful to other subversioners like me who might be hesistant to switch to git. Ben Root On Wed, May 26, 2010 at 5:47 PM, Jarrod Millman wrote: > Hello, > > I changed the subject line for this thread, since I didn't want to > hijack another thread. Anyway, I am not proposing that we actually > decide whether to move to git and github now, but I am just curious > how people would feel. We had a conversation about this a few years > ago and it was quite contentious at the time. Since then, I believe a > number of us have started using git and github for most of our work. > And there are a number of developers using git-svn to develop numpy > now. So I was curious to get a feeling for what people would think > about it, if we moved to git. (I don't want to rehash the arguments > for the move.) > > Anyway, Chuck listed the main concerns we had previously when we > discussed moving from svn to git. See the discussion below. Are > there any other concerns? Am I right in thinking that most of the > developers would prefer git at this point? Or are there still a > number of developers who would prefer using svn still? > > On Wed, May 26, 2010 at 12:54 PM, Charles R Harris > wrote: > > I think the main problem has been windows compatibility. Git is best from > > the command line whereas the windows command line is an afterthought. > > Another box that needs a check-mark is the buildbot. If svn clients are > > supported then it may be that neither of those are going to be a problem. > > I was under the impression that there were a number of decent git > clients for Windows now, but I don't know anyone who develops on > Windows. Are there any NumPy developers who use Windows who could > check out the current situation? 
> > Pulling from github with an svn client works very well, so buildbot > could continue working as is: > http://github.com/blog/626-announcing-svn-support > > And if it turns out the Windows clients are still not good enough, we > could look into the recently add svn write support to github: > http://github.com/blog/644-subversion-write-support > > No need for us to make any changes immediately. I am just curious how > people would feel about it at this point. > > Jarrod > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Wed May 26 22:46:29 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Wed, 26 May 2010 22:46:29 -0400 Subject: [Numpy-discussion] Introduction to Scott, Jason, and (possibly) others from Enthought In-Reply-To: References: <2D706A49-3574-4E5D-81CD-B89B5E6BD0B5@enthought.com> <11E425D5-D34A-44FC-A0B1-6D2641D522EA@enthought.com> Message-ID: On Wed, May 26, 2010 at 9:37 PM, Charles R Harris wrote: > > > On Wed, May 26, 2010 at 7:34 PM, Matthew Brett > wrote: >> >> Hi, >> >> > there is no such thing as a "nice" bash shell for a windows user. >> > I have no idea how to use one. >> >> It is a nice bash shell. ? You may not want a nice bash shell ;) >> >> I can't imagine you'd object to one though. ?It's just a useful place >> to type git commands, with file / directory path autocompletion, git >> branch autocompletion and so on. >> > > Any shell on windows is a pain, if only because of the spaces in the > filenames. When I'm using git, or bzr or svn I use the windows shell which I'm very familiar with ,and allows "standard" copy-paste and has quotes. But since, I think, there are no numpy developers on Windows, and I'm the only one for scipy, and occasional commits I can do with anything, I won't argue this time. Josef > > Chuck > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > From matthew.brett at gmail.com Wed May 26 22:57:49 2010 From: matthew.brett at gmail.com (Matthew Brett) Date: Wed, 26 May 2010 19:57:49 -0700 Subject: [Numpy-discussion] Introduction to Scott, Jason, and (possibly) others from Enthought In-Reply-To: References: <2D706A49-3574-4E5D-81CD-B89B5E6BD0B5@enthought.com> <11E425D5-D34A-44FC-A0B1-6D2641D522EA@enthought.com> Message-ID: Hi, >> Any shell on windows is a pain, if only because of the spaces in the >> filenames. > > When I'm using git, or bzr or svn I use the windows shell which I'm > very familiar with ,and allows "standard" copy-paste and has quotes. > > But since, I think, there are no numpy developers on Windows, and I'm > the only one for scipy, and occasional commits I can do with anything, > I won't argue this time. I've been testing quite a bit on windows recently, and I used to use windows all the time. I've found msysgit to be pretty good. I personally have always hated the windows shell, but you can use msysgit from the windows shell if you prefer... If you're OK with bzr or svn from the windows command line, I am sure git will pose no major problems. 
See you, Matthew From matthew.brett at gmail.com Wed May 26 23:08:21 2010 From: matthew.brett at gmail.com (Matthew Brett) Date: Wed, 26 May 2010 20:08:21 -0700 Subject: [Numpy-discussion] curious about how people would feel about moving to github In-Reply-To: References: Message-ID: Hi, > It seems to me that git's flexibility in how people collaborate means > we can do a certain amount of figuring out after the switch. This is very well said and true to our recent experience with nipy and ipython: http://github.com/ipython/ipython http://github.com/nipy/nipy > My > experience with a small project has been that anyone who wants to make > major changes just clones the repository on github and makes the > changes; then we email the main author to ask him to pull particular > branches into the main repo. It works well enough. That's the model we've gone for in nipy and ipython too. We wrote it up in a workflow doc project. Here are the example docs giving the git workflow for ipython: https://cirl.berkeley.edu/mb312/gitwash/ and in particular: https://cirl.berkeley.edu/mb312/gitwash/development_workflow.html Cheers, Matthew From millman at berkeley.edu Wed May 26 23:43:46 2010 From: millman at berkeley.edu (Jarrod Millman) Date: Wed, 26 May 2010 20:43:46 -0700 Subject: [Numpy-discussion] Introduction to Scott, Jason, and (possibly) others from Enthought In-Reply-To: References: <2D706A49-3574-4E5D-81CD-B89B5E6BD0B5@enthought.com> <11E425D5-D34A-44FC-A0B1-6D2641D522EA@enthought.com> Message-ID: On Wed, May 26, 2010 at 6:35 PM, Charles R Harris wrote: > Heh. Can you to try the svn interface to github using your favorite svn ap. > I suppose we need to set up a test account there. Is it possible to have a > multiple user git account on github, or is it all push and merge? Yes, that is possible. Currently it is not officially allowed according to their terms of service, but they are planning to enable that and have granted exceptions to their current policy in a few instances (that I am familiar with). But this is a detail that we can easily address in a NEP, if there seems to be interest (which there currently seems to be). So for now just keep sending emails with issues you want addressed in any actual proposal for this transition. Let's move this discussion to the git thread and use the thread for the discussion related directly to the subject line. Thanks, -- Jarrod Millman Helen Wills Neuroscience Institute 10 Giannini Hall, UC Berkeley http://cirl.berkeley.edu/ From millman at berkeley.edu Wed May 26 23:45:46 2010 From: millman at berkeley.edu (Jarrod Millman) Date: Wed, 26 May 2010 20:45:46 -0700 Subject: [Numpy-discussion] curious about how people would feel about moving to github In-Reply-To: References: Message-ID: On Wed, May 26, 2010 at 6:37 PM, Bruce Southey wrote: > All that I ask is that there is one official place to do 'git clone' > or 'git pull' from a single official branch. I do not think that it is > good to tell users to pull from different branches especially if these > branches have conflicts. It also provides a common foundation to > troubleshoot problems (of course you don't see it because you don't > have that branch...). Yet I do understand that any release candidate > can be pulled from any tree (as happens with the Linux kernel) and > that this should be more of guide than a fixed rule. That seems to be a very reasonable request and one that I agree with. If we do move to git we will have an official master branch, which will be a single official branch. 
Thanks for the feedback. Jarrod From millman at berkeley.edu Wed May 26 23:49:34 2010 From: millman at berkeley.edu (Jarrod Millman) Date: Wed, 26 May 2010 20:49:34 -0700 Subject: [Numpy-discussion] curious about how people would feel about moving to github In-Reply-To: References: Message-ID: On Wed, May 26, 2010 at 8:08 PM, Matthew Brett wrote: > That's the model we've gone for in nipy and ipython too. ?We wrote it > up in a workflow doc project. ?Here are the example docs giving the > git workflow for ipython: > > https://cirl.berkeley.edu/mb312/gitwash/ > > and in particular: > > https://cirl.berkeley.edu/mb312/gitwash/development_workflow.html I would highly recommend using this workflow. Ideally, we should use the same git workflow for all the scipy-related projects. That way developers can switch between projects without having to switch workflows. The model that Matthew and Fernando developed for nipy and ipython seem like a very reasonable place to start. From vincent at vincentdavis.net Wed May 26 23:52:22 2010 From: vincent at vincentdavis.net (Vincent Davis) Date: Wed, 26 May 2010 21:52:22 -0600 Subject: [Numpy-discussion] How to distinguish between number and string dypes Message-ID: How do I determine if an array's (or column in a structured array) dtype is a number or a string. I see how to determine the actual dtype but all I want to know is if it is a string or a number. *Vincent Davis 720-301-3003 * vincent at vincentdavis.net my blog | LinkedIn -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Wed May 26 23:58:18 2010 From: charlesr.harris at gmail.com (Charles R Harris) Date: Wed, 26 May 2010 21:58:18 -0600 Subject: [Numpy-discussion] Introduction to Scott, Jason, and (possibly) others from Enthought In-Reply-To: References: <2D706A49-3574-4E5D-81CD-B89B5E6BD0B5@enthought.com> <11E425D5-D34A-44FC-A0B1-6D2641D522EA@enthought.com> Message-ID: On Wed, May 26, 2010 at 7:47 PM, Benjamin Root wrote: > > > On Wed, May 26, 2010 at 8:37 PM, Charles R Harris < > charlesr.harris at gmail.com> wrote: > >> >> >> On Wed, May 26, 2010 at 7:34 PM, Matthew Brett wrote: >> >>> Hi, >>> >>> > there is no such thing as a "nice" bash shell for a windows user. >>> > I have no idea how to use one. >>> >>> It is a nice bash shell. You may not want a nice bash shell ;) >>> >>> I can't imagine you'd object to one though. It's just a useful place >>> to type git commands, with file / directory path autocompletion, git >>> branch autocompletion and so on. >>> >>> >> Any shell on windows is a pain, if only because of the spaces in the >> filenames. >> > > Why? Can't you just escape the spaces with backslashes... oh, nevermind... > > Sure, no doubt the experience would cleanse my soul. But flagellation would be less painful. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From pgmdevlist at gmail.com Thu May 27 00:02:57 2010 From: pgmdevlist at gmail.com (Pierre GM) Date: Thu, 27 May 2010 00:02:57 -0400 Subject: [Numpy-discussion] How to distinguish between number and string dypes In-Reply-To: References: Message-ID: <9BB80762-C1F8-4CAC-AAED-6EF03E44752F@gmail.com> On May 26, 2010, at 11:52 PM, Vincent Davis wrote: > > How do I determine if an array's (or column in a structured array) dtype is a number or a string. I see how to determine the actual dtype but all I want to know is if it is a string or a number. 
Check `numpy.lib._iotools._is_string_like` From charlesr.harris at gmail.com Thu May 27 00:22:29 2010 From: charlesr.harris at gmail.com (Charles R Harris) Date: Wed, 26 May 2010 22:22:29 -0600 Subject: [Numpy-discussion] curious about how people would feel about moving to github In-Reply-To: References: Message-ID: On Wed, May 26, 2010 at 9:49 PM, Jarrod Millman wrote: > On Wed, May 26, 2010 at 8:08 PM, Matthew Brett > wrote: > > That's the model we've gone for in nipy and ipython too. We wrote it > > up in a workflow doc project. Here are the example docs giving the > > git workflow for ipython: > > > > https://cirl.berkeley.edu/mb312/gitwash/ > > > > and in particular: > > > > https://cirl.berkeley.edu/mb312/gitwash/development_workflow.html > > I would highly recommend using this workflow. Ideally, we should use > the same git workflow for all the scipy-related projects. That way > developers can switch between projects without having to switch > workflows. The model that Matthew and Fernando developed for nipy and > ipython seem like a very reasonable place to start. > __ > I wouldn't. Who is going to be the gate keeper and pull the stuff? No vacations for him/her, on 24 hour call, yes? They might as well run a dairy. And do we really want all pull requests cross-posted to the list? Linus works full time as gatekeeper for Linux and gets paid for the effort. I think a central repository model would work better for us. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From millman at berkeley.edu Thu May 27 00:23:56 2010 From: millman at berkeley.edu (Jarrod Millman) Date: Wed, 26 May 2010 21:23:56 -0700 Subject: [Numpy-discussion] curious about how people would feel about moving to github In-Reply-To: References: Message-ID: On Wed, May 26, 2010 at 7:38 PM, Benjamin Root wrote: > I wouldn't call myself a developer, but I have been wanting to contribute > recently.? I learned source control with svn, so I am much more comfortable > with it.? My one attempt at using git for a personal project ended in > failure. > > Then I discovered this guide, "Git-SVN Crash Course": > http://git.or.cz/course/svn.html > > I hope this would be useful to other subversioners like me who might be > hesistant to switch to git. Thanks for the link. If we move to git, we will also develop a suggested workflow and post it online so that anyone should be able to just cut-and-paste the git commands. As Matthew mentioned both ipython and nipy have adopted the same workflow: https://cirl.berkeley.edu/mb312/gitwash/development_workflow.html The idea of the above document is not to teach people how to use git in general, but just for the specific way git is used in the development workflow for nipy and ipython. If you have some time to look at the ipython/nipy workflow, it would be useful to know how helpful you think a document like this would be for SVNers switching to git. If you have any other suggestions for what the NEP should include, please let us know. Thanks, Jarrod PS. I am glad to hear that you are interested in contributing to NumPy development. If you are looking for a good place to start, you may want to consider helping with the 2010 summer documentation marathon or submitting a patch to address an open ticket. 
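For the dtype question above: numpy.lib._iotools._is_string_like is a private helper, and the dtype itself can also be inspected directly. A minimal sketch using public attributes (dtype.kind and numpy.issubdtype); the function names and the sample field names below are made up for the example.

    import numpy as np

    def is_string_dtype(dt):
        # 'S' = byte strings, 'U' = unicode strings
        return dt.kind in ('S', 'U')

    def is_number_dtype(dt):
        # covers signed/unsigned integers, floats and complex numbers
        return np.issubdtype(dt, np.number)

    print(is_string_dtype(np.array(['a', 'bc']).dtype))   # True
    print(is_number_dtype(np.array([1.5, 2.0]).dtype))    # True

    # For a structured array, test each column's dtype separately.
    rec = np.array([('x', 1, 2.5)],
                   dtype=[('name', 'S3'), ('count', 'i8'), ('value', 'f8')])
    for field in rec.dtype.names:
        print(field, is_string_dtype(rec.dtype[field]),
              is_number_dtype(rec.dtype[field]))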
From matthew.brett at gmail.com Thu May 27 00:34:08 2010 From: matthew.brett at gmail.com (Matthew Brett) Date: Wed, 26 May 2010 21:34:08 -0700 Subject: [Numpy-discussion] curious about how people would feel about moving to github In-Reply-To: References: Message-ID: Hi, >> I would highly recommend using this workflow. ?Ideally, we should use >> the same git workflow for all the scipy-related projects. ?That way >> developers can switch between projects without having to switch >> workflows. ?The model that Matthew and Fernando developed for nipy and >> ipython seem like a very reasonable place to start. >> __ > > I wouldn't. Who is going to be the gate keeper and pull the stuff? No > vacations for him/her, on 24 hour call, yes? They might as well run a dairy. > And do we really want all pull requests cross-posted to the list? Linus > works full time as gatekeeper for Linux and gets paid for the effort. I > think a central repository model would work better for us. This is just a gentle request - please - wait - and follow Anne's advice - we are smart and versatile and we can adapt. I'm guessing you haven't used git live in a project yet? I've noticed that - until you have got used to the git / DVCS workflow - it seems like it would cause problems. But I strongly encourage you to read these posts from Joel Spolsky http://www.joelonsoftware.com/items/2010/03/17.html and http://hginit.com/00.html The links are about mercurial, but apply equally to git. But the main point is - lots of teams have already switched, and the teams that have switched, never look back. I think no-one I know who has used git seriously for a week or two could imagine going back to the kind of model we need when using subversion. It's very difficult to explain (the posts above are a good attempt) but it's a very common experience. See you, Matthew From aarchiba at physics.mcgill.ca Thu May 27 00:34:41 2010 From: aarchiba at physics.mcgill.ca (Anne Archibald) Date: Thu, 27 May 2010 01:34:41 -0300 Subject: [Numpy-discussion] curious about how people would feel about moving to github In-Reply-To: References: Message-ID: On 27 May 2010 01:22, Charles R Harris wrote: > > > On Wed, May 26, 2010 at 9:49 PM, Jarrod Millman > wrote: >> >> On Wed, May 26, 2010 at 8:08 PM, Matthew Brett >> wrote: >> > That's the model we've gone for in nipy and ipython too. ?We wrote it >> > up in a workflow doc project. ?Here are the example docs giving the >> > git workflow for ipython: >> > >> > https://cirl.berkeley.edu/mb312/gitwash/ >> > >> > and in particular: >> > >> > https://cirl.berkeley.edu/mb312/gitwash/development_workflow.html >> >> I would highly recommend using this workflow. ?Ideally, we should use >> the same git workflow for all the scipy-related projects. ?That way >> developers can switch between projects without having to switch >> workflows. ?The model that Matthew and Fernando developed for nipy and >> ipython seem like a very reasonable place to start. >> __ > > I wouldn't. Who is going to be the gate keeper and pull the stuff? No > vacations for him/her, on 24 hour call, yes? They might as well run a dairy. > And do we really want all pull requests cross-posted to the list? Linus > works full time as gatekeeper for Linux and gets paid for the effort. I > think a central repository model would work better for us. I don't think this is as big a problem as it sounds. If the gatekeeper takes a week-long vacation, so what? 
People keep working on their changes independently and they can get merged when the gatekeeper gets around to it. If they want to accelerate the ultimate merging they can pull the central repository into their own and resolve all conflicts, so that the pull into the central repository goes smoothly. If the gatekeeper's away and the users want to swap patches, well, they just pull from each other's public git repositories. Anne > Chuck > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > From charlesr.harris at gmail.com Thu May 27 00:49:44 2010 From: charlesr.harris at gmail.com (Charles R Harris) Date: Wed, 26 May 2010 22:49:44 -0600 Subject: [Numpy-discussion] curious about how people would feel about moving to github In-Reply-To: References: Message-ID: On Wed, May 26, 2010 at 10:34 PM, Anne Archibald wrote: > On 27 May 2010 01:22, Charles R Harris wrote: > > > > > > On Wed, May 26, 2010 at 9:49 PM, Jarrod Millman > > wrote: > >> > >> On Wed, May 26, 2010 at 8:08 PM, Matthew Brett > > >> wrote: > >> > That's the model we've gone for in nipy and ipython too. We wrote it > >> > up in a workflow doc project. Here are the example docs giving the > >> > git workflow for ipython: > >> > > >> > https://cirl.berkeley.edu/mb312/gitwash/ > >> > > >> > and in particular: > >> > > >> > https://cirl.berkeley.edu/mb312/gitwash/development_workflow.html > >> > >> I would highly recommend using this workflow. Ideally, we should use > >> the same git workflow for all the scipy-related projects. That way > >> developers can switch between projects without having to switch > >> workflows. The model that Matthew and Fernando developed for nipy and > >> ipython seem like a very reasonable place to start. > >> __ > > > > I wouldn't. Who is going to be the gate keeper and pull the stuff? No > > vacations for him/her, on 24 hour call, yes? They might as well run a > dairy. > > And do we really want all pull requests cross-posted to the list? Linus > > works full time as gatekeeper for Linux and gets paid for the effort. I > > think a central repository model would work better for us. > > I don't think this is as big a problem as it sounds. If the gatekeeper > takes a week-long vacation, so what? People keep working on their > changes independently and they can get merged when the gatekeeper gets > around to it. If they want to accelerate the ultimate merging they can > pull the central repository into their own and resolve all conflicts, > so that the pull into the central repository goes smoothly. If the > gatekeeper's away and the users want to swap patches, well, they just > pull from each other's public git repositories. > > Linux has Linus, ipython has Fernando, nipy has... well, I'm sure it is somebody. Numpy and Scipy no longer have a central figure and I like it that way. There is no reason that DVCS has to inevitably lead to a central authority. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From matthew.brett at gmail.com Thu May 27 00:55:56 2010 From: matthew.brett at gmail.com (Matthew Brett) Date: Wed, 26 May 2010 21:55:56 -0700 Subject: [Numpy-discussion] curious about how people would feel about moving to github In-Reply-To: References: Message-ID: Hi, > Linux has Linus, ipython has Fernando, nipy has... well, I'm sure it is > somebody. Numpy and Scipy no longer have a central figure and I like it that > way. 
There is no reason that DVCS has to inevitably lead to a central > authority. I think I was trying to say that the way it looks as if it will be - before you try it - is very different from the way it actually is when you get there. Anne put the idea very well - but I still think it is very hard to understand, without trying it, just how liberating the workflow is from anxieties about central authorities and so on. You can just get on with what you want to do, talk with or merge from whoever you want, and the whole development process becomes much more fluid and productive. And I know that sounds chaotic but - it just works. Really really well. See you, Matthew From aarchiba at physics.mcgill.ca Thu May 27 01:06:59 2010 From: aarchiba at physics.mcgill.ca (Anne Archibald) Date: Thu, 27 May 2010 02:06:59 -0300 Subject: [Numpy-discussion] curious about how people would feel about moving to github In-Reply-To: References: Message-ID: On 27 May 2010 01:55, Matthew Brett wrote: > Hi, > >> Linux has Linus, ipython has Fernando, nipy has... well, I'm sure it is >> somebody. Numpy and Scipy no longer have a central figure and I like it that >> way. There is no reason that DVCS has to inevitably lead to a central >> authority. > > I think I was trying to say that the way it looks as if it will be - > before you try it - is very different from the way it actually is when > you get there. ? Anne put the idea very well - but I still think it is > very hard to understand, without trying it, just how liberating the > workflow is from anxieties about central authorities and so on. ? ?You > can just get on with what you want to do, talk with or merge from > whoever you want, and the whole development process becomes much more > fluid and productive. ? And I know that sounds chaotic but - it just > works. ?Really really well. One way to think of it is that there is no "main line" of development. The only time the central repository needs to pull from the others is when a release is being prepared. As it stands we do have a single release manager, though it's not necessarily the same for each version. So if we wanted, they could just go and pull and merge the repositories of everyone who's made a useful change, then release the results. Of course, this will be vastly easier if all those other people have already merged each other's results (into different branches if appropriate). But just like now, it's the release manager's decision which changes end up in the next version. This is not the only way to do git development; it's the only one I have experience with, so I can't speak for the effectiveness of others. But I have no doubt that we can find some way that works, and I don't think we necessarily need to decide what that is any time soon. Anne > See you, > > Matthew > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From charlesr.harris at gmail.com Thu May 27 01:16:19 2010 From: charlesr.harris at gmail.com (Charles R Harris) Date: Wed, 26 May 2010 23:16:19 -0600 Subject: [Numpy-discussion] curious about how people would feel about moving to github In-Reply-To: References: Message-ID: On Wed, May 26, 2010 at 11:06 PM, Anne Archibald wrote: > On 27 May 2010 01:55, Matthew Brett wrote: > > Hi, > > > >> Linux has Linus, ipython has Fernando, nipy has... well, I'm sure it is > >> somebody. Numpy and Scipy no longer have a central figure and I like it > that > >> way. 
There is no reason that DVCS has to inevitably lead to a central > >> authority. > > > > I think I was trying to say that the way it looks as if it will be - > > before you try it - is very different from the way it actually is when > > you get there. Anne put the idea very well - but I still think it is > > very hard to understand, without trying it, just how liberating the > > workflow is from anxieties about central authorities and so on. You > > can just get on with what you want to do, talk with or merge from > > whoever you want, and the whole development process becomes much more > > fluid and productive. And I know that sounds chaotic but - it just > > works. Really really well. > > One way to think of it is that there is no "main line" of development. > The only time the central repository needs to pull from the others is > when a release is being prepared. As it stands we do have a single > release manager, though it's not necessarily the same for each > version. So if we wanted, they could just go and pull and merge the > repositories of everyone who's made a useful change, then release the > results. Of course, this will be vastly easier if all those other > people have already merged each other's results (into different > branches if appropriate). But just like now, it's the release > manager's decision which changes end up in the next version. > > No, at this point we don't have a release manager, we haven't since 1.2. We have people who do the builds and put them up on sourceforge, but they aren't release managers, they don't decide what is in the release or organise the effort. We haven't had a central figure since Travis got a real job ;) And now David has a real job too. I'm just pointing out that that projects like Linux and IPython have central figures because the originators are still active in the development. Let me put it this way, right now, who would you choose to pull the changes and release the official version? Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From david at silveregg.co.jp Thu May 27 01:28:49 2010 From: david at silveregg.co.jp (David) Date: Thu, 27 May 2010 14:28:49 +0900 Subject: [Numpy-discussion] curious about how people would feel about moving to github In-Reply-To: References: Message-ID: <4BFE0311.20903@silveregg.co.jp> On 05/27/2010 02:16 PM, Charles R Harris wrote: > > > On Wed, May 26, 2010 at 11:06 PM, Anne Archibald > > wrote: > > On 27 May 2010 01:55, Matthew Brett > wrote: > > Hi, > > > >> Linux has Linus, ipython has Fernando, nipy has... well, I'm > sure it is > >> somebody. Numpy and Scipy no longer have a central figure and I > like it that > >> way. There is no reason that DVCS has to inevitably lead to a > central > >> authority. > > > > I think I was trying to say that the way it looks as if it will be - > > before you try it - is very different from the way it actually is > when > > you get there. Anne put the idea very well - but I still think > it is > > very hard to understand, without trying it, just how liberating the > > workflow is from anxieties about central authorities and so on. > You > > can just get on with what you want to do, talk with or merge from > > whoever you want, and the whole development process becomes much more > > fluid and productive. And I know that sounds chaotic but - it just > > works. Really really well. > > One way to think of it is that there is no "main line" of development. 
> The only time the central repository needs to pull from the others is > when a release is being prepared. As it stands we do have a single > release manager, though it's not necessarily the same for each > version. So if we wanted, they could just go and pull and merge the > repositories of everyone who's made a useful change, then release the > results. Of course, this will be vastly easier if all those other > people have already merged each other's results (into different > branches if appropriate). But just like now, it's the release > manager's decision which changes end up in the next version. > > > No, at this point we don't have a release manager, we haven't since 1.2. > We have people who do the builds and put them up on sourceforge, but > they aren't release managers, they don't decide what is in the release > or organise the effort. We haven't had a central figure since Travis got > a real job ;) And now David has a real job too. I'm just pointing out > that that projects like Linux and IPython have central figures because > the originators are still active in the development. Let me put it this > way, right now, who would you choose to pull the changes and release the > official version? Ralf is the release manager, and for deciding what goes into the release, we do just as we do now. For small changes which do not warrant discussion, they would be handled through pull requests in github at first, but we can improve after that (for example having an automatic gatekeeper which only pulls something that would at least compile and pass the test on a linux machine). David From matthew.brett at gmail.com Thu May 27 01:34:01 2010 From: matthew.brett at gmail.com (Matthew Brett) Date: Wed, 26 May 2010 22:34:01 -0700 Subject: [Numpy-discussion] curious about how people would feel about moving to github In-Reply-To: References: Message-ID: Hi, > No, at this point we don't have a release manager, we haven't since 1.2. We > have people who do the builds and put them up on sourceforge, but they > aren't release managers, they don't decide what is in the release or > organise the effort. We haven't had a central figure since Travis got a real > job ;) And now David has a real job too. I'm just pointing out that that > projects like Linux and IPython have central figures because the originators > are still active in the development. Let me put it this way, right now, who > would you choose to pull the changes and release the official version? OK - for nipy - we have - I think - 5 people who can commit into the main repository. Any one of those 5 people can review someone's work, and commit into the main repository. My guess is - with numpy - there would be some number of people with the same permissions - I imagine you among them. But the rule is - No-one commits into the main repo without someone reviewing and agreeing the work Any trusted person can review. But the point is: No development in the main repo. Merges only. Why? Let's flip your question the other way round. You are saying - I want to continue (as for SVN) to develop in the main repo. But the main repo is where everyone merges from. That means that a) It makes it much harder for anyone to review your changes because they are mixed up in a lot of other changes and b) You force everyone following numpy to adopt your changes In practice - that means that you make it harder for others by making them follow your line of development when they may not want to - until it's ready. 
I guess you'd agree that code review is essential to good code quality - both for improving code - and for teaching. It encourages new developers because they know their work will be checked. It helps developers learn the coding guidelines and to share good practice. It helps the developers have a broad knowledge of the code base. With SVN / central repo development - that's really hard - because all the development lines get mixed up as people work in different places. With git / DVCS - it suddenly becomes absolutely natural. I think that's why people like Joel Spolsy say stuff like 'This is possibly the biggest advance in software development technology in the ten years I?ve been writing articles here.' : http://www.joelonsoftware.com/items/2010/03/17.html Please - try it - see - I am absolutely sure you'll love it after a very short time... Matthew From charlesr.harris at gmail.com Thu May 27 01:34:39 2010 From: charlesr.harris at gmail.com (Charles R Harris) Date: Wed, 26 May 2010 23:34:39 -0600 Subject: [Numpy-discussion] curious about how people would feel about moving to github In-Reply-To: <4BFE0311.20903@silveregg.co.jp> References: <4BFE0311.20903@silveregg.co.jp> Message-ID: On Wed, May 26, 2010 at 11:28 PM, David wrote: > On 05/27/2010 02:16 PM, Charles R Harris wrote: > > > > > > On Wed, May 26, 2010 at 11:06 PM, Anne Archibald > > > wrote: > > > > On 27 May 2010 01:55, Matthew Brett > > wrote: > > > Hi, > > > > > >> Linux has Linus, ipython has Fernando, nipy has... well, I'm > > sure it is > > >> somebody. Numpy and Scipy no longer have a central figure and I > > like it that > > >> way. There is no reason that DVCS has to inevitably lead to a > > central > > >> authority. > > > > > > I think I was trying to say that the way it looks as if it will be > - > > > before you try it - is very different from the way it actually is > > when > > > you get there. Anne put the idea very well - but I still think > > it is > > > very hard to understand, without trying it, just how liberating > the > > > workflow is from anxieties about central authorities and so on. > > You > > > can just get on with what you want to do, talk with or merge from > > > whoever you want, and the whole development process becomes much > more > > > fluid and productive. And I know that sounds chaotic but - it > just > > > works. Really really well. > > > > One way to think of it is that there is no "main line" of > development. > > The only time the central repository needs to pull from the others is > > when a release is being prepared. As it stands we do have a single > > release manager, though it's not necessarily the same for each > > version. So if we wanted, they could just go and pull and merge the > > repositories of everyone who's made a useful change, then release the > > results. Of course, this will be vastly easier if all those other > > people have already merged each other's results (into different > > branches if appropriate). But just like now, it's the release > > manager's decision which changes end up in the next version. > > > > > > No, at this point we don't have a release manager, we haven't since 1.2. > > We have people who do the builds and put them up on sourceforge, but > > they aren't release managers, they don't decide what is in the release > > or organise the effort. We haven't had a central figure since Travis got > > a real job ;) And now David has a real job too. 
I'm just pointing out > > that that projects like Linux and IPython have central figures because > > the originators are still active in the development. Let me put it this > > way, right now, who would you choose to pull the changes and release the > > official version? > > Ralf is the release manager, and for deciding what goes into the > release, we do just as we do now. For small changes which do not warrant > discussion, they would be handled through pull requests in github at > first, but we can improve after that (for example having an automatic > gatekeeper which only pulls something that would at least compile and > pass the test on a linux machine). > > So you are saying that Ralf has to manage all the pull requests? Have you asked Ralf about that? An automatic gatekeeper is pretty much a central repository, as I was suggesting. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From david at silveregg.co.jp Thu May 27 01:46:23 2010 From: david at silveregg.co.jp (David) Date: Thu, 27 May 2010 14:46:23 +0900 Subject: [Numpy-discussion] curious about how people would feel about moving to github In-Reply-To: References: <4BFE0311.20903@silveregg.co.jp> Message-ID: <4BFE072F.2000803@silveregg.co.jp> On 05/27/2010 02:34 PM, Charles R Harris wrote: > An automatic gatekeeper is pretty much a > central repository, as I was suggesting. I don't understand how centraly repository comes into this discussion - nobody has been arguing against it. The question is whether we would continue to push individual commits to it directly (push), or we should present branches to a gatekeeper. I would suggest that you look on how people do it in projects using git, there are countless ressources on how to do it, and it has worked very well for pretty much every project. I can't see how numpy would be so different that it would require something different, especially without having tried it first. If the pull model really fails, then we can always change. cheers, David From charlesr.harris at gmail.com Thu May 27 01:48:59 2010 From: charlesr.harris at gmail.com (Charles R Harris) Date: Wed, 26 May 2010 23:48:59 -0600 Subject: [Numpy-discussion] curious about how people would feel about moving to github In-Reply-To: References: Message-ID: On Wed, May 26, 2010 at 11:34 PM, Matthew Brett wrote: > Hi, > > > No, at this point we don't have a release manager, we haven't since 1.2. > We > > have people who do the builds and put them up on sourceforge, but they > > aren't release managers, they don't decide what is in the release or > > organise the effort. We haven't had a central figure since Travis got a > real > > job ;) And now David has a real job too. I'm just pointing out that that > > projects like Linux and IPython have central figures because the > originators > > are still active in the development. Let me put it this way, right now, > who > > would you choose to pull the changes and release the official version? > > OK - for nipy - we have - I think - 5 people who can commit into the > main repository. Any one of those 5 people can review someone's work, > and commit into the main repository. My guess is - with numpy - > there would be some number of people with the same permissions - I > imagine you among them. But the rule is - > > No-one commits into the main repo without someone reviewing and > agreeing the work > > Any trusted person can review. But the point is: > > No development in the main repo. Merges only. > > Why? 
> > Let's flip your question the other way round. > > You are saying - I want to continue (as for SVN) to develop in the main > repo. > > No, I am saying we need at least five people who can commit to the main repo. That is the central repository model. > But the main repo is where everyone merges from. That means that > > a) It makes it much harder for anyone to review your changes because > they are mixed up in a lot of other changes and b) You force everyone following numpy to adopt your changes > > In practice - that means that you make it harder for others by making > them follow your line of development when they may not want to - until > it's ready. > > Review is fine, and it would be nice if more people were reviewing code. At the moment I think it is just Pauli, Stefan, and myself. I guess you'd agree that code review is essential to good code quality > - both for improving code - and for teaching. It encourages new > developers because they know their work will be checked. It helps > developers learn the coding guidelines and to share good practice. It > helps the developers have a broad knowledge of the code base. > > With SVN / central repo development - that's really hard - because all > the development lines get mixed up as people work in different places. > > But a repo that five folks can commit to *is* a central repository, by definition. DVCS and central repository are orthogonal concepts. With git / DVCS - it suddenly becomes absolutely natural. > > I think that's why people like Joel Spolsy say stuff like 'This is > possibly the biggest advance in software development technology in the > ten years I?ve been writing articles here.' : > http://www.joelonsoftware.com/items/2010/03/17.html > > Please - try it - see - I am absolutely sure you'll love it after a > very short time... > > Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From matthew.brett at gmail.com Thu May 27 01:55:38 2010 From: matthew.brett at gmail.com (Matthew Brett) Date: Wed, 26 May 2010 22:55:38 -0700 Subject: [Numpy-discussion] curious about how people would feel about moving to github In-Reply-To: References: Message-ID: Hi, > No, I am saying we need at least five people who can commit to the main > repo. That is the central repository model. Excellent - yes - that's reasonable. Then if you also agree to this: > No development in the main repo. Merges only. then we're all in full agreement. > Review is fine, and it would be nice if more people were reviewing code. At > the moment I think it is just Pauli, Stefan, and myself. Right - and that is partly because it so much harder to do review with the model that we have at the moment, and partly because we don't yet have the tradition in numpy of review. I think - honestly - if we're going to be able to encourage and train new developers - we'll have to get on that as soon as we can... See you, Matthew From charlesr.harris at gmail.com Thu May 27 01:58:00 2010 From: charlesr.harris at gmail.com (Charles R Harris) Date: Wed, 26 May 2010 23:58:00 -0600 Subject: [Numpy-discussion] curious about how people would feel about moving to github In-Reply-To: References: Message-ID: On Wed, May 26, 2010 at 11:55 PM, Matthew Brett wrote: > Hi, > > > No, I am saying we need at least five people who can commit to the main > > repo. That is the central repository model. > > Excellent - yes - that's reasonable. Then if you also agree to this: > > > No development in the main repo. Merges only. > > then we're all in full agreement. 
> > > Review is fine, and it would be nice if more people were reviewing code. > At > > the moment I think it is just Pauli, Stefan, and myself. > > Right - and that is partly because it so much harder to do review with > the model that we have at the moment, and partly because we don't yet > have the tradition in numpy of review. I think - honestly - if we're > going to be able to encourage and train new developers - we'll have to > get on that as soon as we can... > > See you, > > Matthew > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Thu May 27 02:05:42 2010 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu, 27 May 2010 00:05:42 -0600 Subject: [Numpy-discussion] curious about how people would feel about moving to github In-Reply-To: References: Message-ID: On Wed, May 26, 2010 at 11:55 PM, Matthew Brett wrote: > Hi, > > > No, I am saying we need at least five people who can commit to the main > > repo. That is the central repository model. > > Excellent - yes - that's reasonable. Then if you also agree to this: > > > No development in the main repo. Merges only. > > then we're all in full agreement. > > How does that differ from what we do now? Review? I develop in my own branches as is. > > Review is fine, and it would be nice if more people were reviewing code. > At > > the moment I think it is just Pauli, Stefan, and myself. > > Right - and that is partly because it so much harder to do review with > the model that we have at the moment, and partly because we don't yet > have the tradition in numpy of review. I think - honestly - if we're > going to be able to encourage and train new developers - we'll have to > get on that as soon as we can... > > True, but what happens when there is no review? I might point out that there are currently tickets with patches for review going back two years and reviewing a patch isn't *that* much harder than visiting github. Using git makes merging changes much easier, but it doesn't solve the review problem. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From matthew.brett at gmail.com Thu May 27 02:14:17 2010 From: matthew.brett at gmail.com (Matthew Brett) Date: Wed, 26 May 2010 23:14:17 -0700 Subject: [Numpy-discussion] curious about how people would feel about moving to github In-Reply-To: References: Message-ID: Hi, > How does that differ from what we do now? Review? I develop in my own > branches as is. Right - so - then do you always ask for a review from someone before merging into trunk? If so, then git is just a much more fluid, reliable and faster tool to do what you are doing now. > True, but what happens when there is no review? I might point out that there > are currently tickets with patches for review going back two years and > reviewing a patch isn't *that* much harder than visiting github. Using git > makes merging changes much easier, but it doesn't solve the review problem. Well - that's true and not true. The joy of git branches and the ease of merging is that you quickly get into the habit of making feature branches for each piece of work. This makes it extremely easy for someone else to review the changes that you have made. So, it greatly lowers the work needed for someone to review your code, and therefore makes it more likely. 
Having said that - it will of course happen that you ask for review and no-one responds. That's not a very big problem, because git merges are so easy that you can - as Anne said earlier - just keep on developing without worrying that your changes will go out of date. But if there's a long wait - or it's urgent - then what I do is just email with 'If I don't hear anything I'll merge these changes in a few days'. See you, Matthew From charlesr.harris at gmail.com Thu May 27 02:27:14 2010 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu, 27 May 2010 00:27:14 -0600 Subject: [Numpy-discussion] curious about how people would feel about moving to github In-Reply-To: References: Message-ID: On Thu, May 27, 2010 at 12:14 AM, Matthew Brett wrote: > Hi, > > > How does that differ from what we do now? Review? I develop in my own > > branches as is. > > Right - so - then do you always ask for a review from someone before > merging into trunk? If so, then git is just a much more fluid, > reliable and faster tool to do what you are doing now. > > > True, but what happens when there is no review? I might point out that > there > > are currently tickets with patches for review going back two years and > > reviewing a patch isn't *that* much harder than visiting github. Using > git > > makes merging changes much easier, but it doesn't solve the review > problem. > > Well - that's true and not true. The joy of git branches and the ease > of merging is that you quickly get into the habit of making feature > branches for each piece of work. This makes it extremely easy for > someone else to review the changes that you have made. So, it > greatly lowers the work needed for someone to review your code, and > therefore makes it more likely. > > Having said that - it will of course happen that you ask for review > and no-one responds. That's not a very big problem, because git > merges are so easy that you can - as Anne said earlier - just keep on > developing without worrying that your changes will go out of date. > But if there's a long wait - or it's urgent - then what I do is just > email with 'If I don't hear anything I'll merge these changes in a few > days'. > > Exactly. I had a private bet with myself that that would be the case. See, it isn't so much different after all. The tools change, but the problems and solutions remain much the same. Given that there are only three people doing reviews, and really only two really looking at the c code, I expect that a lot of stuff will be merged without much in the way of review. Now if git leads to more developers that might change. Here's hoping. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From matthew.brett at gmail.com Thu May 27 02:33:03 2010 From: matthew.brett at gmail.com (Matthew Brett) Date: Wed, 26 May 2010 23:33:03 -0700 Subject: [Numpy-discussion] curious about how people would feel about moving to github In-Reply-To: References: Message-ID: Hi, >> Having said that - it will of course happen that you ask for review >> and no-one responds. ?That's not a very big problem, because git >> merges are so easy that you can - as Anne said earlier - just keep on >> developing without worrying that your changes will go out of date. >> But if there's a long wait - or it's urgent - then what I do is just >> email with 'If I don't hear anything I'll merge these changes in a few >> days'. >> > > Exactly. I had a private bet with myself that that would be the case. See, > it isn't so much different after all. 
The tools change, but the problems and > solutions remain much the same. Given that there are only three people doing > reviews, and really only two really looking at the c code, I expect that a > lot of stuff will be merged without much in the way of review. Well - I do honestly think that a decentralized git workflow is the best tool to improve that. > Now if git leads to more developers that might change. Here's hoping. I hope so too. I accidentally ran across this a few days ago: http://www.erlang.org/ - "This [Erlang/OTP R13B04] is the first release after the introduction of the official Git repository at Github and it is amazing to notice that the number of contributions from the community has increased significantly. As many as 32 contributors have provided 1 or more patches each until now, resulting in 51 integrated patches from the open source community in this service release." Here's hoping... See you, Matthew From stefan at sun.ac.za Thu May 27 03:20:00 2010 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Thu, 27 May 2010 00:20:00 -0700 Subject: [Numpy-discussion] curious about how people would feel about moving to github In-Reply-To: References: Message-ID: On 26 May 2010 23:27, Charles R Harris wrote: > Exactly. I had a private bet with myself that that would be the case. See, > it isn't so much different after all. The tools change, but the problems and > solutions remain much the same. In this case, I believe the tool may be part of the solution. With limited manpower at our disposal, having a somewhat painful process certainly doesn't help. - Working with patches is unreliable (check out all the patches in Trac that don't apply cleanly and how much effort it will be to fix them). Distributed revision control provides a much better structure within which to manage patches. - Merging in SVN is horrible and will never encourage branches. Without branches, trunk becomes turbulent easily. - We currently don't have any code review in place. This isn't SVN's fault, but tools such as GitHub's compare view (http://github.com/blog/612-introducing-github-compare-view) look really promising Maybe most importantly, distributed revision control places any possible contributor on equal footing with those with commit access; this is one important step in making contributors feel valued. Regards St?fan From faltet at pytables.org Thu May 27 03:27:25 2010 From: faltet at pytables.org (Francesc Alted) Date: Thu, 27 May 2010 09:27:25 +0200 Subject: [Numpy-discussion] How to distinguish between number and string dypes In-Reply-To: References: Message-ID: <201005270927.25802.faltet@pytables.org> A Thursday 27 May 2010 05:52:22 Vincent Davis escrigu?: > How do I determine if an array's (or column in a structured array) dtype is > a number or a string. I see how to determine the actual dtype but all I > want to know is if it is a string or a number. I suppose that the `.kind` attribute of dtype would help you: In [2]: s = np.dtype("S3") In [4]: s.kind Out[4]: 'S' In [5]: i = np.dtype("i4") In [6]: i.kind Out[6]: 'i' In [7]: f = np.dtype("f8") In [8]: f.kind Out[8]: 'f' -- Francesc Alted From charlesr.harris at gmail.com Thu May 27 03:43:51 2010 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu, 27 May 2010 01:43:51 -0600 Subject: [Numpy-discussion] curious about how people would feel about moving to github In-Reply-To: References: Message-ID: 2010/5/27 St?fan van der Walt > On 26 May 2010 23:27, Charles R Harris wrote: > > Exactly. 
I had a private bet with myself that that would be the case. > See, > > it isn't so much different after all. The tools change, but the problems > and > > solutions remain much the same. > > In this case, I believe the tool may be part of the solution. With > limited manpower at our disposal, having a somewhat painful process > certainly doesn't help. > > It should help. A commitment to doing reviews is probably more important here than submitting for review. It's less fun than development and takes a certain commitment. Of course, there are probably some perverts out there who find it enjoyable. I hope we find some. > - Working with patches is unreliable (check out all the patches in > Trac that don't apply cleanly and how much effort it will be to fix > them). Distributed revision control provides a much better structure > within which to manage patches. > > Two year old patches are always going to be a problem. The real fix here is not to let things languish. > - Merging in SVN is horrible and will never encourage branches. > Without branches, trunk becomes turbulent easily. > > True. Although there would need to be more activity to get to true turbulence. > - We currently don't have any code review in place. This isn't SVN's > fault, but tools such as GitHub's compare view > (http://github.com/blog/612-introducing-github-compare-view) look > really promising > > Maybe most importantly, distributed revision control places any > possible contributor on equal footing with those with commit access; > this is one important step in making contributors feel valued. > > Well, not quite. They can't commit to the main repository. I think the main thing is to be responsive: fast review, quick commit. And quick to offer commit rights to anyone who sends in more that a couple of decent patches. Maybe we should take a vow to review one patch a week. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From friedrichromstedt at gmail.com Thu May 27 03:49:21 2010 From: friedrichromstedt at gmail.com (Friedrich Romstedt) Date: Thu, 27 May 2010 09:49:21 +0200 Subject: [Numpy-discussion] Help Convolution with binaural filters(HRTFs) In-Reply-To: References: Message-ID: 2010/5/26 arthur de conihout : > i try to implement a real-time convolution module refreshed by head > listener location (angle from a reference point).The result of the > convolution by binaural flters(HRTFs) allows me to spatialize a monophonic > wavfile. I suspect noone not closely involved with your subject can understand this. From what you write later on I guess binaural filters are some LTI system? > I got trouble with this as long as my convolution doesnt seem to > work properly: > np.convolve() doesnt convolve the entire entry signal Hmm http://docs.scipy.org/doc/numpy/reference/generated/numpy.convolve.html#numpy-convolve claims that the convolution is complete. Can you give an example of what you mean? Furthermore, I think the note there about scipy.signal.fftconcolve may be of large use for you, when you are going to convolve whole wav files? > ->trouble with extracting numpyarrays from the audio wav. filters and > monophonic entry > ->trouble then with encaspulating the resulting array in a proper wav > file...it is not read by audacity Hmmm I worked one time with wavs using the wave module, which is a standard module. I didn't deal with storing wavs. I attach the reading module for you. It needs a module mesh2 to import, which I don't include to save traffic. 
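To illustrate the two suggestions above (reading the wav through the standard wave module into a NumPy array, and using scipy.signal.fftconvolve for whole files), here is a minimal sketch. It is not the attached wav.py module; it assumes 16-bit mono files, and the file names and the read_mono_wav helper are placeholders only:

import wave
import numpy as np
from scipy.signal import fftconvolve

def read_mono_wav(path):
    # return (sample rate, float samples in [-1, 1)) from a 16-bit mono wav
    w = wave.open(path, "r")
    if w.getnchannels() != 1 or w.getsampwidth() != 2:
        raise ValueError("this sketch only handles 16-bit mono files")
    rate = w.getframerate()
    data = np.frombuffer(w.readframes(w.getnframes()), dtype=np.int16)
    w.close()
    return rate, data / 2.0 ** 15

rate, source = read_mono_wav("source_mono.wav")   # placeholder file names
rate_h, hrir = read_mono_wav("hrir_left.wav")
assert rate == rate_h                             # both files must share one sampling rate

# FFT-based convolution is much cheaper than np.convolve for long filters
left = fftconvolve(source, hrir)

# write the result back out at the *original* rate; doubling the rate shifts the pitch
out = wave.open("out_left.wav", "w")
out.setnchannels(1)
out.setsampwidth(2)
out.setframerate(rate)
out.writeframes(np.clip(left * 2.0 ** 15, -32768, 32767).astype(np.int16).tostring())
out.close()
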
I think the code is understandable without it and the method .get_raw_by_frames() may already help solving your problem. But I didn't really get the point what your aim is. As far as I understood you want to do sth named "spacialise" with the audio, based on the position of some person with respect to some reference point. What means "spacialise" in this case? I guess it's not simply a delay for creating stereo impression? I guess it is sth creating also a room impression, sth like "small or large room"? Friedrich -------------- next part -------------- A non-text attachment was scrubbed... Name: wav.py Type: application/octet-stream Size: 5225 bytes Desc: not available URL: From stefan at sun.ac.za Thu May 27 05:10:59 2010 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Thu, 27 May 2010 02:10:59 -0700 Subject: [Numpy-discussion] curious about how people would feel about moving to github In-Reply-To: References: Message-ID: On 27 May 2010 00:43, Charles R Harris wrote: > Well, not quite. They can't commit to the main repository. I think the main > thing is to be responsive: fast review, quick commit. And quick to offer > commit rights to anyone who sends in more that a couple of decent patches. At the moment, giving a developer commit access is a nebulous process; not exactly encouraging. I agree with you when you say that we should commit to doing reviews, not to let patches languish, etc. But on top of that, I believe that we should make this easy, inviting and fun; a big part of that is finding the right tool for the job. Remember those days when Trac was horribly broken? That certainly made hacking unpleasant. We upgraded and reconfigured, and now issue tracking is a lot more palatable. Even more painfully, we'll soon be heading for a series of big merges (numpy core refactor); who wants to do those using SVN? Regards St?fan From aarchiba at physics.mcgill.ca Thu May 27 05:28:34 2010 From: aarchiba at physics.mcgill.ca (Anne Archibald) Date: Thu, 27 May 2010 06:28:34 -0300 Subject: [Numpy-discussion] curious about how people would feel about moving to github In-Reply-To: References: Message-ID: On 27 May 2010 04:43, Charles R Harris wrote: > >> Maybe most importantly, distributed revision control places any >> possible contributor on equal footing with those with commit access; >> this is one important step in making contributors feel valued. >> > > Well, not quite. They can't commit to the main repository. I think the main > thing is to be responsive: fast review, quick commit. And quick to offer > commit rights to anyone who sends in more that a couple of decent patches. > Maybe we should take a vow to review one patch a week. Okay. Suppose we wanted to replicate the current permissions arrangement as closely as possible with git. It seems to me it would look something like this: * Set up a git repository somewhere on scipy.org. * Give everyone who currently has permission to commit to SVN permission to write to this repository. * git submissions would become possible: a user would make some changes but instead of posting a patch would link to a particular git state. The changes could be reviewed and incorporated like a patch, but with easier merging and better history. If the changes became out of date the user could easily merge from the central repository and resolve the conflict themselves. * Patch submissions would be reviewed as now and committed to git by one of the people who do this now. 
Alternatively they could be integrated to the mainline by someone without write access and published as a git change, to be incorporated (easily) as above by someone with write access. * if review and inclusion were slow it would nevertheless be easy for users to pull from each other and build on each other's changes without making the eventual merge a nightmare. So, no major change to who controls what. The nipy/ipython model takes this a step further, reasoning that git makes branching and merging so easy there's no need for such a large group of people with write access to the central repository, but if that doesn't work for numpy/scipy we don't need to do it. And we can change in either direction at any time with no major changes to infrastructure or workflow. To get back to the original point of the thread: nobody has yet objected to git, and all we have are some debates about the ultimate workflow that don't make much difference to whether or how git should be adopted. Is this a fair description? Anne From millman at berkeley.edu Thu May 27 05:40:46 2010 From: millman at berkeley.edu (Jarrod Millman) Date: Thu, 27 May 2010 02:40:46 -0700 Subject: [Numpy-discussion] curious about how people would feel about moving to github In-Reply-To: References: Message-ID: On Thu, May 27, 2010 at 2:28 AM, Anne Archibald wrote: > To get back to the original point of the thread: nobody has yet > objected to git, and all we have are some debates about the ultimate > workflow that don't make much difference to whether or how git should > be adopted. Is this a fair description? Yes, that is my take on it. Since it seems that everyone is open to *discuss* moving to git/github, Stefan and I will draft a NEP for this transition. Stefan is currently visiting Berkeley, so we can easily work together on this over the next few days. However, we are going camping this weekend so we will be off-line more or less from Thursday night until Monday night. We will start the git/github NEP during the trip and then post it to the list for feedback and discussion on Monday night or Tuesday morning. If anyone else is interested in helping draft the NEP over the weekend, please let me know ASAP. We will raise and address as many concerns as possible. I believe the concerns raised so far can be satisfactorily addressed and hopefully the process of writing the NEP will let us systematically explore any potential concerns or problems. Here is a quick list of topics we will address in the NEP: - Client (Windows, Mac, and Linux) support - Issue tracking system integration - Buildbot interaction - Workflow - Legacy support for svn clients - Testing and deployment - Potential timeline If you have any other areas of concern you would like to see addressed, please let us know. Obviously, the weekend draft will be subject to change according to the feedback. Thanks, Jarrod From millman at berkeley.edu Thu May 27 05:49:54 2010 From: millman at berkeley.edu (Jarrod Millman) Date: Thu, 27 May 2010 02:49:54 -0700 Subject: [Numpy-discussion] curious about how people would feel about moving to github In-Reply-To: References: Message-ID: On Thu, May 27, 2010 at 2:28 AM, Anne Archibald wrote: > * Set up a git repository somewhere on scipy.org. It's a minor point, but setting up and maintaining our own git repository will require extra work without gaining anything useful. Github has a number of very useful features and is gaining new functionality all the time. 
It also greatly simplifies account management, which is a royal pain with our current system. Obviously this is a separate issue from whether we move to git or not, but I just wanted to address it quickly. I've registered the following github accounts just to reserve them for now: http://github.com/numpy http://github.com/scipy http://github.com/scikits Jarrod From arthurdeconihout at gmail.com Thu May 27 07:00:14 2010 From: arthurdeconihout at gmail.com (arthur de conihout) Date: Thu, 27 May 2010 13:00:14 +0200 Subject: [Numpy-discussion] Help Convolution with binaural filters(HRTFs) In-Reply-To: References: Message-ID: Hi thanks for your answer It s my first time i get involved in such a forum, i m a raw recruit and i don't exactly know how to post properly. I try to make me clearer on my project : ""But I didn't really get the point what your aim is. As far as I understood you want to do sth named "spacialise" with the audio, based on the position of some person with respect to some reference point. What means "spacialise" in this case? I guess it's not simply a delay for creating stereo impression? I guess it is sth creating also a room impression, sth like "small or large room"?"" The project consists in Binaural Synthesis( that s the given name), which is a solution for Sound Spatialzation which is the closest to rea-life listening. Thus binaural encoding of spatial information is fundamentaly based on the synthesis of localisation cues, namely the ITD(Interaural Time Difference), ILD(Interaural Level Difference)and the SC(Spectral Cues). A way to create an artificial sound scene is by using binaural filters.The binaural signals are then obtained by convolving a monophonic source signal with a pair of binaural filters that reproduce the transfer function of the acoustic path between the source location and the listener's ears .These transfer functions are refered to as Head Related Transfer Functions or HRTF( their time equivalent HRIR for Head Related Impulse Response). These HRTF can be obtained by measurement.The protocol consists in setting very small microphones in the ears of a listenner and to send for each direction in his acoustic sphere a white noise.Thus we obtained the impulse response to a white noise which correponds to the acoustic signature of the sound in a given direction(all this is made in anechoic room to consider a neutral room without reverb).Then to "spatialize" the sound i convolve two of these IR(left and right ear response) with the monophonic sound and i obtain that the sound seems to come from the given direction.The decoding part use a headphone to eliminate problem with the room response by making the recording point of the HRTF closer to the reproduction point (headphone). I give you a part of the code i use for convolution with all the wav treatment: import wave, struct, numpy, time SAMPLE_RATE = 88200 *#here i got trouble if i set 44100 instead the final wav is under pitched even if the original was 44100?* def convo(foriginal, ffiltre): original = wave.open(foriginal, "r") filtre = wave.open(ffiltre, "r") * # i m creating the file in which i will write the result of the convolution* filtered = wave.open("/home/arthur/Desktop/NubiaFilteredFIRnoteR.wav", "w") filtered.setnchannels(1) filtered.setsampwidth(2) filtered.setframerate(SAMPLE_RATE) *#when i unpack the monophonic and the filter i might be making a mistake with the arguments that make my convolution not be performed on the entire signal? 
** #getting wav mono file info to unpack the data properly* nframes = original.getnframes() nchannels=original.getnchannels() original = struct.unpack_from("%dh" % nframes*nchannels, original.readframes(nframes*nchannels)) *#i dont really understand the %dh and the s/2.0**15 but it might be my problem * original = [s / 2.0**15 for s in original] nframes=filtre.getnframes() nchannels=filtre.getnchannels() filtre = struct.unpack_from("%dh" % nframes*nchannels, filtre.readframes(nframes*nchannels)) filtre = [s / 2.0**15 for s in filtre] result = numpy.convolve(original, filtre) result = [ sample * 2.0**15 for sample in result ] filtered.writeframes(struct.pack('%dh' % len(result), *result)) filtered.close() convo(foriginal, ffiltre) i had a look to what you sent me i m on my way understanding maybe your initialisation tests will allow me to make difference between every wav formats? i want to be able to encode every formats (16bit unsigned, 32bits)what precautions do i have to respect in the filtering?do filter and original must be the same or? Thank you AdeC 2010/5/27 Friedrich Romstedt > 2010/5/26 arthur de conihout : > > i try to implement a real-time convolution module refreshed by head > > listener location (angle from a reference point).The result of the > > convolution by binaural flters(HRTFs) allows me to spatialize a > monophonic > > wavfile. > > I suspect noone not closely involved with your subject can understand > this. From what you write later on I guess binaural filters are some > LTI system? > > > I got trouble with this as long as my convolution doesnt seem to > > work properly: > > np.convolve() doesnt convolve the entire entry signal > > Hmm > http://docs.scipy.org/doc/numpy/reference/generated/numpy.convolve.html#numpy-convolve > claims that the convolution is complete. Can you give an example of > what you mean? > > Furthermore, I think the note there about scipy.signal.fftconcolve may > be of large use for you, when you are going to convolve whole wav > files? > > > ->trouble with extracting numpyarrays from the audio wav. filters and > > monophonic entry > > ->trouble then with encaspulating the resulting array in a proper wav > > file...it is not read by audacity > > Hmmm I worked one time with wavs using the wave module, which is a > standard module. I didn't deal with storing wavs. I attach the > reading module for you. It needs a module mesh2 to import, which I > don't include to save traffic. I think the code is understandable > without it and the method .get_raw_by_frames() may already help > solving your problem. > > But I didn't really get the point what your aim is. As far as I > understood you want to do sth named "spacialise" with the audio, based > on the position of some person with respect to some reference point. > What means "spacialise" in this case? I guess it's not simply a delay > for creating stereo impression? I guess it is sth creating also a > room impression, sth like "small or large room"? > > Friedrich > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From ralf.gommers at googlemail.com Thu May 27 07:20:40 2010 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Thu, 27 May 2010 19:20:40 +0800 Subject: [Numpy-discussion] curious about how people would feel about moving to github In-Reply-To: References: <4BFE0311.20903@silveregg.co.jp> Message-ID: On Thu, May 27, 2010 at 1:34 PM, Charles R Harris wrote: > > > On Wed, May 26, 2010 at 11:28 PM, David wrote: > >> On 05/27/2010 02:16 PM, Charles R Harris wrote: >> > >> > >> > On Wed, May 26, 2010 at 11:06 PM, Anne Archibald >> > > wrote: >> > >> > On 27 May 2010 01:55, Matthew Brett > > > wrote: >> > > Hi, >> > > >> > >> Linux has Linus, ipython has Fernando, nipy has... well, I'm >> > sure it is >> > >> somebody. Numpy and Scipy no longer have a central figure and I >> > like it that >> > >> way. There is no reason that DVCS has to inevitably lead to a >> > central >> > >> authority. >> > > >> > > I think I was trying to say that the way it looks as if it will >> be - >> > > before you try it - is very different from the way it actually is >> > when >> > > you get there. Anne put the idea very well - but I still think >> > it is >> > > very hard to understand, without trying it, just how liberating >> the >> > > workflow is from anxieties about central authorities and so on. >> > You >> > > can just get on with what you want to do, talk with or merge from >> > > whoever you want, and the whole development process becomes much >> more >> > > fluid and productive. And I know that sounds chaotic but - it >> just >> > > works. Really really well. >> > >> > One way to think of it is that there is no "main line" of >> development. >> > The only time the central repository needs to pull from the others >> is >> > when a release is being prepared. As it stands we do have a single >> > release manager, though it's not necessarily the same for each >> > version. So if we wanted, they could just go and pull and merge the >> > repositories of everyone who's made a useful change, then release >> the >> > results. Of course, this will be vastly easier if all those other >> > people have already merged each other's results (into different >> > branches if appropriate). But just like now, it's the release >> > manager's decision which changes end up in the next version. >> > >> > >> > No, at this point we don't have a release manager, we haven't since 1.2. >> > We have people who do the builds and put them up on sourceforge, but >> > they aren't release managers, they don't decide what is in the release >> > or organise the effort. We haven't had a central figure since Travis got >> > a real job ;) And now David has a real job too. I'm just pointing out >> > that that projects like Linux and IPython have central figures because >> > the originators are still active in the development. Let me put it this >> > way, right now, who would you choose to pull the changes and release the >> > official version? >> >> Ralf is the release manager, and for deciding what goes into the >> release, we do just as we do now. For small changes which do not warrant >> discussion, they would be handled through pull requests in github at >> first, but we can improve after that (for example having an automatic >> gatekeeper which only pulls something that would at least compile and >> pass the test on a linux machine). >> >> > So you are saying that Ralf has to manage all the pull requests? > I'd hope not. For the record, I really like the development model Matthew proposed. 
About deciding what goes into a release, I'm sure that David meant small stuff like "this can't go in, it's too late in the release cycle" or "this code needs tests if you want it to be in this release". Cheers, Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From cournape at gmail.com Thu May 27 07:20:51 2010 From: cournape at gmail.com (David Cournapeau) Date: Thu, 27 May 2010 20:20:51 +0900 Subject: [Numpy-discussion] Help Convolution with binaural filters(HRTFs) In-Reply-To: References: Message-ID: On Wed, May 26, 2010 at 10:43 PM, arthur de conihout wrote: >> >> >>> Hi, >>> i try to implement a real-time convolution module refreshed by head >>> listener location (angle from a reference point).The result of the >>> convolution by binaural flters(HRTFs) allows me to spatialize a monophonic >>> wavfile. I got trouble with this as long as my convolution doesnt seem to >>> work properly: >>> np.convolve() doesnt convolve the entire entry signal >>> ->trouble with extracting numpyarrays from the audio wav. filters and >>> monophonic entry >>> ->trouble then with encaspulating the resulting array in a proper wav >>> file...it is not read by audacity >>> Do you have any idea of how this could work or do you have any >>> implementation of stereo filtering by impulse response to submit me For reading audio files into numpy, I suggest you use audiolab. It uses libsndfile underneath, which handles various wav format *really* well (and is most likely the one used by audacity): http://pypi.python.org/pypi/scikits.audiolab/ Concerning the filtering part, filtering in the time domain is way too consuming, because HRTF impulse responses are quite long, so you should use FFT (with windowing of course, using overlap add methods) David From cournape at gmail.com Thu May 27 07:23:41 2010 From: cournape at gmail.com (David Cournapeau) Date: Thu, 27 May 2010 20:23:41 +0900 Subject: [Numpy-discussion] Help Convolution with binaural filters(HRTFs) In-Reply-To: References: Message-ID: On Thu, May 27, 2010 at 8:00 PM, arthur de conihout wrote: > #i dont really understand the %dh and the s/2.0**15 but it might be my > problem > ?original = [s / 2.0**15 for s in original] This is done because wav file are (usually, not always) in fixed point, with values in the unsigned 16 bits int range (~ [-32768, 32768]), but when you want to do processing in floating point (as does numpy), you want the values normalized (in the [-1, 1] range). 2 * 15 gives you the normalization factor. But audiolab does this for you automatically, David From ralf.gommers at googlemail.com Thu May 27 07:51:42 2010 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Thu, 27 May 2010 19:51:42 +0800 Subject: [Numpy-discussion] Introduction to Scott, Jason, and (possibly) others from Enthought In-Reply-To: References: <2D706A49-3574-4E5D-81CD-B89B5E6BD0B5@enthought.com> <11E425D5-D34A-44FC-A0B1-6D2641D522EA@enthought.com> Message-ID: On Wed, May 26, 2010 at 12:23 PM, Travis Oliphant wrote: > > On May 25, 2010, at 5:06 PM, David Cournapeau wrote: > > > On Wed, May 26, 2010 at 6:19 AM, Charles R Harris > > wrote: > > > >> Sounds good, but what if it doesn't get finished in a few months? I > think we > >> should get 2.0.0 out pronto, ideally it would already have been > released. I > >> think a major refactoring like this proposal should get the 3.0.0 label. > > > > Naming it 3.0 or 2.1 does not matter much - I think we should avoid > > breaking things twice. 
I can see a few solutions: > > - postpone 2.0 "indefinitely", until this new work is done > > - backport py3k changes to 1.5 (which would be API and ABI > > compatible with 1.4.1), and 2.0 would contain all the breaking > > changes. > > This is an interesting idea and also workable. > > > > > I am really worried about breaking things once now and once in a few > > months (or even a year). > > I am too. That's why this discussion. We will have the NumPy refactor > done by end of July at the latest. Numpy 2.0 should be able to come out in > August. > > This thread got a bit side-tracked with the move to git, but I don't see a conclusion about what to release when. Even if the refactoring is done in July, I think a 2.0 release with so many major changes will probably need a longer test/release cycle. So if we say September, do you still want a 1.5 release? Cheers, Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From vincent at vincentdavis.net Thu May 27 09:02:58 2010 From: vincent at vincentdavis.net (Vincent Davis) Date: Thu, 27 May 2010 07:02:58 -0600 Subject: [Numpy-discussion] How to distinguish between number and string dypes In-Reply-To: <201005270927.25802.faltet@pytables.org> References: <201005270927.25802.faltet@pytables.org> Message-ID: On Thu, May 27, 2010 at 1:27 AM, Francesc Alted wrote: > A Thursday 27 May 2010 05:52:22 Vincent Davis escrigu?: > > How do I determine if an array's (or column in a structured array) dtype > is > > a number or a string. I see how to determine the actual dtype but all I > > want to know is if it is a string or a number. > > I suppose that the `.kind` attribute of dtype would help you: > > In [2]: s = np.dtype("S3") > > In [4]: s.kind > Out[4]: 'S' > > In [5]: i = np.dtype("i4") > > In [6]: i.kind > Out[6]: 'i' > > In [7]: f = np.dtype("f8") > > In [8]: f.kind > Out[8]: 'f' > I know about this but the problem is that while the fist example is usable, the others are not great because to know that it is a number I would need to do something like(see below) but I might miss a number dtype, def is_number(obj): if obj.dtype.kind in ('i', 'f',..): return True Pierre GM "Check `numpy.lib._iotools._is_string_like`" This is ok, but I am having problem making it work, I keep getting an error that I am giving it 2 items and it only takes 1. Obviously I think I am giving it 1. This of course tells me if it is string like but not that "is" a number. Thanks Vincent > -- > Francesc Alted > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > *Vincent Davis 720-301-3003 * vincent at vincentdavis.net my blog | LinkedIn -------------- next part -------------- An HTML attachment was scrubbed... URL: From friedrichromstedt at gmail.com Thu May 27 09:03:40 2010 From: friedrichromstedt at gmail.com (Friedrich Romstedt) Date: Thu, 27 May 2010 15:03:40 +0200 Subject: [Numpy-discussion] Help Convolution with binaural filters(HRTFs) In-Reply-To: References: Message-ID: 2010/5/27 arthur de conihout : > I try to make me clearer on my project : > [...] I think I understood now. Thank you for explanation. > ?original = [s / 2.0**15 for s in original] > > ??? nframes=filtre.getnframes() > ??? nchannels=filtre.getnchannels() > ??? filtre = struct.unpack_from("%dh" % nframes*nchannels, > filtre.readframes(nframes*nchannels)) > ??? filtre = [s / 2.0**15 for s in filtre] > > ??? 
result = numpy.convolve(original, filtre) Additionally to what David pointed out, it should also suffice to normalise only the impulse response channel. Furthermore, I would spend some thought on what "normalised" actually means. And I think it can only be understood in Fourier domain. When taking the conservation of energy into account, i.e. the conservation of the L2 norm of the impulse response function under Fourier transformation, it can be normalised in time domain also by setting the time-domain L2 norm to unity. (This is already much different from the maximum normalisation.) Then the energy content of the signal before and after the processing is identical. I.e., it emphasises some frequencies in favour of others which are diminished in volume. A better approach might be in my opinion to use the (already directly) determined transfer function. This can be normalised by the power of the input signal, which can e.g. be determined by a reference measurement without any obstruction in defined distance, I would say. > ??? result = [ sample * 2.0**15 for sample in result ] > ??? filtered.writeframes(struct.pack('%dh' % len(result), *result)) It's to me too a mystery why you observe this octave jump you reported on. I guess it's an octave, because you can compensate by doubling the sampling rate. Can you check whether your playback program loads the output file as single or double channel? Also I guess some relation between your observation of "incomplete convolution" and this pitch change. And now, I'm very irritated by the way you handle multiple channels in the input file. Actually you load maybe a two-channel input file, extract *all* the data, i.e. data from ch1, ch2, ch1, ch2, ..., and then convolve this? For single-channel input, this is correct, but when your input data is two-channel, it explains on the one hand why your program maybe doesn't work properly and on the other why you have pitch-halfening (you double the length of each frame). It is I think possible that you didn't notice more strange phenomenon, since the impulse response function is applied to each datum individually. > i had a look to what you sent me i m on my way understanding maybe your > initialisation tests will allow me to make difference between every? wav > formats? Exactly! It should be able to decode at least most wavs, with different samp widhts and different channel number. But I think I will myself in future hold to David's advice ... > i want to be able to encode every formats (16bit unsigned, 32bits)what > precautions do i have to respect in the filtering?do filter and original > must be the same or? > Thank you All problems would go away when you use the transfer function directly. It should be present as some function, which can easily be interpolated to the frequency points your FFT of the input signal yields. Interpolating the time-domain impulse response is not a good idea, since it assumes already frequency-boundedness, which is not necessarily fulfilled, I guess. Also it's much more complicated. When you measure transfer function, how can you reconstruct a unique impulse response? Do you measure also phases? When you shine in with white noise, do autocorrelation of the result and Fourier transform, the phases of the transfer function are lost, you only obtain amplitudes (squared). Of course it is the same to multiply in Fourier space with the transfer function as to convolve with its Fourier-back transform, but I'm not yet convinced that this is a good idea to do it in time domain tough ... 
Can you maybe give some hint? This means you essentially convolve with the autocorrelation function itself - but this one is symmetric and therefore not causal. I think it's strange when sound starts before it starts in the input ... the impulse response must be causal. Anyway, this problem remains also when you multiply with the plain real numbers in Fourier domain which result from the Fourier transform of the autocorrelation function - it's still acausal. The only way to circumvent this I'm seeing currently is to measure the phases of the transfer function too - i.e. to do a frequency sweep, white noise autocorrelation is then as far as I think not sufficient. Sorry, I feel it didn't came out very clear ... Friedrich From cournape at gmail.com Thu May 27 09:42:30 2010 From: cournape at gmail.com (David Cournapeau) Date: Thu, 27 May 2010 22:42:30 +0900 Subject: [Numpy-discussion] Introduction to Scott, Jason, and (possibly) others from Enthought In-Reply-To: References: <2D706A49-3574-4E5D-81CD-B89B5E6BD0B5@enthought.com> <11E425D5-D34A-44FC-A0B1-6D2641D522EA@enthought.com> Message-ID: On Thu, May 27, 2010 at 8:51 PM, Ralf Gommers wrote: > > > On Wed, May 26, 2010 at 12:23 PM, Travis Oliphant > wrote: >> >> On May 25, 2010, at 5:06 PM, David Cournapeau wrote: >> >> > On Wed, May 26, 2010 at 6:19 AM, Charles R Harris >> > wrote: >> > >> >> Sounds good, but what if it doesn't get finished in a few months? I >> >> think we >> >> should get 2.0.0 out pronto, ideally it would already have been >> >> released. I >> >> think a major refactoring like this proposal should get the 3.0.0 >> >> label. >> > >> > Naming it 3.0 or 2.1 does not matter much - I think we should avoid >> > breaking things twice. I can see a few solutions: >> > ?- postpone 2.0 "indefinitely", until this new work is done >> > ?- backport py3k changes to 1.5 (which would be API and ABI >> > compatible with 1.4.1), and 2.0 would contain all the breaking >> > changes. >> >> This is an interesting idea and also workable. >> >> > >> > I am really worried about breaking things once now and once in a few >> > months (or even a year). >> >> I am too. ?That's why this discussion. ? ?We will have the NumPy refactor >> done by end of July at the latest. ? Numpy 2.0 should be able to come out in >> August. >> > This thread got a bit side-tracked with the move to git, but I don't see a > conclusion about what to release when. My understanding is that there was a general agreement on splitting the code that breaks the ABI/API (datetime, maybe future refactoring) from everything else (mostly py3k port): - 1.5: everything in the trunk minus API/ABI breaking stuff - 2.0: still in flux, depends on how the refactoring will happen cheers, David From arthurdeconihout at gmail.com Thu May 27 10:31:36 2010 From: arthurdeconihout at gmail.com (arthur de conihout) Date: Thu, 27 May 2010 16:31:36 +0200 Subject: [Numpy-discussion] Help Convolution with binaural filters(HRTFs) In-Reply-To: References: Message-ID: ""Can you maybe give some hint?"" The most commonly used model for HRTF implementation is the one refered to as "minimum phase filter and pure delay".It is composed of: -a minimum-phase filter, which accounts for the magnitude spectrum of HRTF -and a pure delay , which represents the temporal information contained in the HRTF If H(f) is the HRTF to be implemented , the corresponding Hminphase is given by: 2010/5/27 Friedrich Romstedt > 2010/5/27 arthur de conihout : > > I try to make me clearer on my project : > > [...] 
> > I think I understood now. Thank you for explanation. > > > original = [s / 2.0**15 for s in original] > > > > nframes=filtre.getnframes() > > nchannels=filtre.getnchannels() > > filtre = struct.unpack_from("%dh" % nframes*nchannels, > > filtre.readframes(nframes*nchannels)) > > filtre = [s / 2.0**15 for s in filtre] > > > > result = numpy.convolve(original, filtre) > > Additionally to what David pointed out, it should also suffice to > normalise only the impulse response channel. > > Furthermore, I would spend some thought on what "normalised" actually > means. And I think it can only be understood in Fourier domain. When > taking the conservation of energy into account, i.e. the conservation > of the L2 norm of the impulse response function under Fourier > transformation, it can be normalised in time domain also by setting > the time-domain L2 norm to unity. (This is already much different > from the maximum normalisation.) Then the energy content of the > signal before and after the processing is identical. I.e., it > emphasises some frequencies in favour of others which are diminished > in volume. > > A better approach might be in my opinion to use the (already directly) > determined transfer function. This can be normalised by the power of > the input signal, which can e.g. be determined by a reference > measurement without any obstruction in defined distance, I would say. > > > result = [ sample * 2.0**15 for sample in result ] > > filtered.writeframes(struct.pack('%dh' % len(result), *result)) > > It's to me too a mystery why you observe this octave jump you reported > on. I guess it's an octave, because you can compensate by doubling > the sampling rate. Can you check whether your playback program loads > the output file as single or double channel? Also I guess some > relation between your observation of "incomplete convolution" and this > pitch change. > > And now, I'm very irritated by the way you handle multiple channels in > the input file. Actually you load maybe a two-channel input file, > extract *all* the data, i.e. data from ch1, ch2, ch1, ch2, ..., and > then convolve this? For single-channel input, this is correct, but > when your input data is two-channel, it explains on the one hand why > your program maybe doesn't work properly and on the other why you have > pitch-halfening (you double the length of each frame). It is I think > possible that you didn't notice more strange phenomenon, since the > impulse response function is applied to each datum individually. > > > i had a look to what you sent me i m on my way understanding maybe your > > initialisation tests will allow me to make difference between every wav > > formats? > > Exactly! It should be able to decode at least most wavs, with > different samp widhts and different channel number. But I think I > will myself in future hold to David's advice ... > > > i want to be able to encode every formats (16bit unsigned, 32bits)what > > precautions do i have to respect in the filtering?do filter and original > > must be the same or? > > Thank you > > All problems would go away when you use the transfer function > directly. It should be present as some function, which can easily be > interpolated to the frequency points your FFT of the input signal > yields. Interpolating the time-domain impulse response is not a good > idea, since it assumes already frequency-boundedness, which is not > necessarily fulfilled, I guess. Also it's much more complicated. 
> > When you measure transfer function, how can you reconstruct a unique > impulse response? Do you measure also phases? When you shine in with > white noise, do autocorrelation of the result and Fourier transform, > the phases of the transfer function are lost, you only obtain > amplitudes (squared). Of course it is the same to multiply in Fourier > space with the transfer function as to convolve with its Fourier-back > transform, but I'm not yet convinced that this is a good idea to do it > in time domain tough ... Can you maybe give some hint? > > This means you essentially convolve with the autocorrelation function > itself - but this one is symmetric and therefore not causal. I think > it's strange when sound starts before it starts in the input ... the > impulse response must be causal. Anyway, this problem remains also > when you multiply with the plain real numbers in Fourier domain which > result from the Fourier transform of the autocorrelation function - > it's still acausal. The only way to circumvent this I'm seeing > currently is to measure the phases of the transfer function too - i.e. > to do a frequency sweep, white noise autocorrelation is then as far as > I think not sufficient. > > Sorry, I feel it didn't came out very clear ... > > Friedrich > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From kwgoodman at gmail.com Thu May 27 10:39:10 2010 From: kwgoodman at gmail.com (Keith Goodman) Date: Thu, 27 May 2010 07:39:10 -0700 Subject: [Numpy-discussion] How to distinguish between number and string dypes In-Reply-To: References: <201005270927.25802.faltet@pytables.org> Message-ID: On Thu, May 27, 2010 at 6:02 AM, Vincent Davis wrote: > > On Thu, May 27, 2010 at 1:27 AM, Francesc Alted wrote: >> >> A Thursday 27 May 2010 05:52:22 Vincent Davis escrigu?: >> > How do I determine if an array's (or column in a structured array) dtype is >> > a number or a string. I see how to determine the actual dtype but all I >> > ?want to know is if it is a string or a number. >> >> I suppose that the `.kind` attribute of dtype would help you: >> >> In [2]: s = np.dtype("S3") >> >> In [4]: s.kind >> Out[4]: 'S' >> >> In [5]: i = np.dtype("i4") >> >> In [6]: i.kind >> Out[6]: 'i' >> >> In [7]: f = np.dtype("f8") >> >> In [8]: f.kind >> Out[8]: 'f' > > I know about this but the problem is that while the fist example is usable, the others are not great because to know that it is a number I would need to do something like(see below) but I might miss a number dtype, > def is_number(obj): > ?? ?if obj.dtype.kind in ('i', 'f',..): > ?? ? ? ?return True >> >> Pierre GM "Check `numpy.lib._iotools._is_string_like`" > > This is ok, but I am having problem making it work, I keep getting an error that I am giving it 2 items and it only takes 1. Obviously I think I am giving it 1. This of course tells me if it is string like but not that "is" a number. 
> Thanks > Vincent To see if it is a number could you use something like: np.issubdtype(a.dtype, float) or np.issubdtype(a.dtype, int) or np.issubdtype(a.dtype, complex) And for string: np.issubdtype(a.dtype, str) From arthurdeconihout at gmail.com Thu May 27 10:42:04 2010 From: arthurdeconihout at gmail.com (arthur de conihout) Date: Thu, 27 May 2010 16:42:04 +0200 Subject: [Numpy-discussion] Help Convolution with binaural filters(HRTFs) In-Reply-To: References: Message-ID: sorry for the suspense... mod(Hmin(f))= mod(H(f)) phase(Hmin(f))=Im(HilbertTransform(-log(mod(H(f)))) the pure delay is computed by estimating the HRTF (or HRIR) delay.For convenience implementation of one single delay is prefereed corresponding to the difference delay and is applied to the contralateral ear (opposed to the signal) This is common implementation but in my opinion you can also choose to convolve HRIR directly with the monophonic sound it will take into account the delay since the measure of right and left are synchronised and when you plot it you can see the Interaural Delay.It doesnt take in account evolution of the delay with the frequency which will lead to spectral coloration but in the case of my practical application i can omit it. 2010/5/27 arthur de conihout > ""Can you maybe give some hint?"" > The most commonly used model for HRTF implementation is the one refered to > as "minimum phase filter and pure delay".It is composed of: > -a minimum-phase filter, which accounts for the magnitude spectrum of HRTF > -and a pure delay , which represents the temporal information contained in > the HRTF > > If H(f) is the HRTF to be implemented , the corresponding Hminphase is > given by: > > > 2010/5/27 Friedrich Romstedt > >> 2010/5/27 arthur de conihout : >> >> > I try to make me clearer on my project : >> > [...] >> >> I think I understood now. Thank you for explanation. >> >> > original = [s / 2.0**15 for s in original] >> > >> > nframes=filtre.getnframes() >> > nchannels=filtre.getnchannels() >> > filtre = struct.unpack_from("%dh" % nframes*nchannels, >> > filtre.readframes(nframes*nchannels)) >> > filtre = [s / 2.0**15 for s in filtre] >> > >> > result = numpy.convolve(original, filtre) >> >> Additionally to what David pointed out, it should also suffice to >> normalise only the impulse response channel. >> >> Furthermore, I would spend some thought on what "normalised" actually >> means. And I think it can only be understood in Fourier domain. When >> taking the conservation of energy into account, i.e. the conservation >> of the L2 norm of the impulse response function under Fourier >> transformation, it can be normalised in time domain also by setting >> the time-domain L2 norm to unity. (This is already much different >> from the maximum normalisation.) Then the energy content of the >> signal before and after the processing is identical. I.e., it >> emphasises some frequencies in favour of others which are diminished >> in volume. >> >> A better approach might be in my opinion to use the (already directly) >> determined transfer function. This can be normalised by the power of >> the input signal, which can e.g. be determined by a reference >> measurement without any obstruction in defined distance, I would say. >> >> > result = [ sample * 2.0**15 for sample in result ] >> > filtered.writeframes(struct.pack('%dh' % len(result), *result)) >> >> It's to me too a mystery why you observe this octave jump you reported >> on. 
I guess it's an octave, because you can compensate by doubling >> the sampling rate. Can you check whether your playback program loads >> the output file as single or double channel? Also I guess some >> relation between your observation of "incomplete convolution" and this >> pitch change. >> >> And now, I'm very irritated by the way you handle multiple channels in >> the input file. Actually you load maybe a two-channel input file, >> extract *all* the data, i.e. data from ch1, ch2, ch1, ch2, ..., and >> then convolve this? For single-channel input, this is correct, but >> when your input data is two-channel, it explains on the one hand why >> your program maybe doesn't work properly and on the other why you have >> pitch-halfening (you double the length of each frame). It is I think >> possible that you didn't notice more strange phenomenon, since the >> impulse response function is applied to each datum individually. >> >> > i had a look to what you sent me i m on my way understanding maybe your >> > initialisation tests will allow me to make difference between every wav >> > formats? >> >> Exactly! It should be able to decode at least most wavs, with >> different samp widhts and different channel number. But I think I >> will myself in future hold to David's advice ... >> >> > i want to be able to encode every formats (16bit unsigned, 32bits)what >> > precautions do i have to respect in the filtering?do filter and original >> > must be the same or? >> > Thank you >> >> All problems would go away when you use the transfer function >> directly. It should be present as some function, which can easily be >> interpolated to the frequency points your FFT of the input signal >> yields. Interpolating the time-domain impulse response is not a good >> idea, since it assumes already frequency-boundedness, which is not >> necessarily fulfilled, I guess. Also it's much more complicated. >> >> When you measure transfer function, how can you reconstruct a unique >> impulse response? Do you measure also phases? When you shine in with >> white noise, do autocorrelation of the result and Fourier transform, >> the phases of the transfer function are lost, you only obtain >> amplitudes (squared). Of course it is the same to multiply in Fourier >> space with the transfer function as to convolve with its Fourier-back >> transform, but I'm not yet convinced that this is a good idea to do it >> in time domain tough ... Can you maybe give some hint? >> >> This means you essentially convolve with the autocorrelation function >> itself - but this one is symmetric and therefore not causal. I think >> it's strange when sound starts before it starts in the input ... the >> impulse response must be causal. Anyway, this problem remains also >> when you multiply with the plain real numbers in Fourier domain which >> result from the Fourier transform of the autocorrelation function - >> it's still acausal. The only way to circumvent this I'm seeing >> currently is to measure the phases of the transfer function too - i.e. >> to do a frequency sweep, white noise autocorrelation is then as far as >> I think not sufficient. >> >> Sorry, I feel it didn't came out very clear ... >> >> Friedrich >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From vincent at vincentdavis.net Thu May 27 11:40:13 2010 From: vincent at vincentdavis.net (Vincent Davis) Date: Thu, 27 May 2010 09:40:13 -0600 Subject: [Numpy-discussion] How to distinguish between number and string dypes In-Reply-To: References: <201005270927.25802.faltet@pytables.org> Message-ID: On Thu, May 27, 2010 at 8:39 AM, Keith Goodman wrote: > To see if it is a number could you use something like: > > np.issubdtype(a.dtype, float) or np.issubdtype(a.dtype, int) or > np.issubdtype(a.dtype, complex) > > And for string: > > np.issubdtype(a.dtype, str) > These are valid but what I don't like is that I need to know the list of possible number types. Basically I don't like a test that fails because I didn't know about a dtype. For string It is ok, the universe of is either string or not string. Maybe this is as good as it gets. I guess my use case is that I want to be sure I can perform math on the values. So maybe I should just do someting like "numpy.lib._iotools._is_string_like" but "_is_number_like", Maybe there is such and I missed it. If not there should be. Vincent _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > *Vincent Davis 720-301-3003 * vincent at vincentdavis.net my blog | LinkedIn -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Thu May 27 11:59:13 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Thu, 27 May 2010 11:59:13 -0400 Subject: [Numpy-discussion] NotImplemented returns Message-ID: A while ago we had a brief discussion about this. Is this a feature? or should there be a ticket for this >>> np.sqrt('5') NotImplemented >>> a = np.sqrt('5') >>> a NotImplemented >>> type(a) Josef From robert.kern at gmail.com Thu May 27 12:03:28 2010 From: robert.kern at gmail.com (Robert Kern) Date: Thu, 27 May 2010 12:03:28 -0400 Subject: [Numpy-discussion] How to distinguish between number and string dypes In-Reply-To: References: <201005270927.25802.faltet@pytables.org> Message-ID: On Thu, May 27, 2010 at 11:40, Vincent Davis wrote: > > On Thu, May 27, 2010 at 8:39 AM, Keith Goodman wrote: >> >> To see if it is a number could you use something like: >> np.issubdtype(a.dtype, float) or np.issubdtype(a.dtype, int) or >> np.issubdtype(a.dtype, complex) >> >> And for string: >> >> np.issubdtype(a.dtype, str) > > These are valid but what I don't like is that I need to know the list of possible number types. Basically I don't like a test that fails because I didn't know about a dtype. For string It is ok, the universe of is either string or not string. Maybe this is as good as it gets. The dtypes have a hierarchy. In [2]: np.issubdtype(float, np.number) Out[2]: True In [3]: np.issubdtype(str, np.number) Out[3]: False -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." 
?-- Umberto Eco From charlesr.harris at gmail.com Thu May 27 12:18:33 2010 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu, 27 May 2010 10:18:33 -0600 Subject: [Numpy-discussion] Extending documentation to c code In-Reply-To: References: Message-ID: On Wed, May 26, 2010 at 8:14 AM, Pauli Virtanen wrote: > Wed, 26 May 2010 07:15:08 -0600, Charles R Harris wrote: > > On Wed, May 26, 2010 at 2:59 AM, Pauli Virtanen wrote: > > > >> Wed, 26 May 2010 06:57:27 +0900, David Cournapeau wrote: [clip: > >> doxygen] > >> > It is yet another format to use inside C sources (I don't think > >> > doxygen supports rest), and I would rather have something that is > >> > similar, ideally integrated into sphinx. It also generates rather > >> > ugly doc by default, > >> > >> Anyway, we can probably nevertheless just agree on a readable > >> plain-text/ rst format, and then just use doxygen to generate the docs, > >> as a band-aid. > >> > >> http://github.com/pv/numpycdoc > > > > Neat. I didn't quite see the how how you connected the rst documentation > > and doxygen. > > I didn't :) > > But I just did: doing this it was actually a 10 min job since Doxygen > accepts HTML -- now it parses the comments as RST and renders it properly > as HTML in the Doxygen output. Of course getting links etc. to work would > require more effort, but that's left as an exercise for someone else to > finish. > > Why don't you go ahead and merge this. If someone wants to substitute something else for doxygen at some point, then that is still open, meanwhile we can get started on writing some cdocs. In particular, it would be nice if the folks doing the code refactoring also documented any new functions. We can also put together a numpycdoc standard to go with it. I think your idea of combining the standard numpy doc format with the usual c code comment style is the way to go. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Thu May 27 12:21:14 2010 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu, 27 May 2010 10:21:14 -0600 Subject: [Numpy-discussion] NotImplemented returns In-Reply-To: References: Message-ID: On Thu, May 27, 2010 at 9:59 AM, wrote: > A while ago we had a brief discussion about this. > > > Is this a feature? or should there be a ticket for this > > >>> np.sqrt('5') > NotImplemented > >>> a = np.sqrt('5') > >>> a > NotImplemented > >>> type(a) > > > Maybe an enhancement ticket. The NotImplemented return is appropriate for some functions, but for functions with a single argument we should probably raise an error. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsouthey at gmail.com Thu May 27 12:25:39 2010 From: bsouthey at gmail.com (Bruce Southey) Date: Thu, 27 May 2010 11:25:39 -0500 Subject: [Numpy-discussion] How to distinguish between number and string dypes In-Reply-To: References: <201005270927.25802.faltet@pytables.org> Message-ID: <4BFE9D03.7010002@gmail.com> On 05/27/2010 10:40 AM, Vincent Davis wrote: > On Thu, May 27, 2010 at 8:39 AM, Keith Goodman > wrote: > > To see if it is a number could you use something like: > > np.issubdtype(a.dtype, float) or np.issubdtype(a.dtype, int) or > np.issubdtype(a.dtype, complex) > > And for string: > > np.issubdtype(a.dtype, str) > > > These are valid but what I don't like is that I need to know the list > of possible number types. Basically I don't like a test that fails > because I didn't know about a dtype. 
For string It is ok, the universe > of is either string or not string. Maybe this is as good as it gets. > > I guess my use case is that I want to be sure I can perform math on > the values. So maybe I should just do someting like > "numpy.lib._iotools._is_string_like" but "_is_number_like", Maybe > there is such and I missed it. If not there should be. > > Vincent > > > Can you give an example of what you are trying to do? If some of your string arrays only have string representations of numbers that you want to do the math on then you have to attempt to convert those arrays into a numeric dtype (probably float) using for example asarray(). Bruce >>> import numpy as np >>> a=np.array([1,2,3]) >>> c=np.array(['1','2','3']) >>> d=np.array(['a','b','1']) >>> np.asarray(a, dtype=float) array([ 1., 2., 3.]) >>> np.asarray(c,dtype=float) array([ 1., 2., 3.]) >>> np.asarray(d,dtype=float) Traceback (most recent call last): File "", line 1, in File "/usr/lib64/python2.6/site-packages/numpy/core/numeric.py", line 284, in asarray return array(a, dtype, copy=False, order=order) ValueError: invalid literal for float(): a >>> try: ... np.asarray(d,dtype=float) ... except: ... print 'fail' ... fail -------------- next part -------------- An HTML attachment was scrubbed... URL: From pav at iki.fi Thu May 27 12:57:54 2010 From: pav at iki.fi (Pauli Virtanen) Date: Thu, 27 May 2010 16:57:54 +0000 (UTC) Subject: [Numpy-discussion] NotImplemented returns References: Message-ID: Thu, 27 May 2010 10:21:14 -0600, Charles R Harris wrote: [clip] > Maybe an enhancement ticket. The NotImplemented return is appropriate > for some functions, but for functions with a single argument we should > probably raise an error. A NotImplemented value leaking to the the user is a bug. The function should raise a ValueError instead. NotImplemented is meant only for use as a placeholder singleton in implementation of rich comparison operations etc., and we shouldn't introduce any new meanings IMHO. -- Pauli Virtanen From arthurdeconihout at gmail.com Thu May 27 12:58:35 2010 From: arthurdeconihout at gmail.com (arthur de conihout) Date: Thu, 27 May 2010 18:58:35 +0200 Subject: [Numpy-discussion] Help Convolution with binaural filters(HRTFs) In-Reply-To: References: Message-ID: Thank you for the answer i would love trying audiolab but i got this error while importing i m running python2.6 >>> import audiolab /usr/local/lib/python2.6/dist-packages/audiolab-0.0.0-py2.6-linux-i686.egg/audiolab/soundio/play.py:48: UserWarning: Could not import alsa backend; most probably, you did not have alsa headers when building audiolab >>> import scikits.audiolab Traceback (most recent call last): File "", line 1, in File "scikits/audiolab/__init__.py", line 25, in from pysndfile import formatinfo, sndfile File "scikits/audiolab/pysndfile/__init__.py", line 1, in from _sndfile import Sndfile, Format, available_file_formats, available_encodings *ImportError: No module named _sndfile* Thank you AdeC 2010/5/27 David Cournapeau > On Thu, May 27, 2010 at 8:00 PM, arthur de conihout > wrote: > > > #i dont really understand the %dh and the s/2.0**15 but it might be my > > problem > > original = [s / 2.0**15 for s in original] > > This is done because wav file are (usually, not always) in fixed > point, with values in the unsigned 16 bits int range (~ [-32768, > 32768]), but when you want to do processing in floating point (as does > numpy), you want the values normalized (in the [-1, 1] range). 2 * 15 > gives you the normalization factor. 
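As a small illustration of the normalisation just described (a sketch only: the sample values below are made up, and real code would read them out of the wav file with struct or audiolab):

import numpy as np

# Hypothetical 16-bit PCM samples as struct.unpack would return them
# from a wav file: integers in roughly [-32768, 32767].
samples_int16 = np.array([0, 16384, -16384, 32767, -32768], dtype=np.int16)

# Dividing by 2**15 rescales them to floats in [-1.0, 1.0) for processing.
samples_float = samples_int16 / 2.0**15
samples_float.min(), samples_float.max()   # (-1.0, ~0.99997)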
But audiolab does this for you > automatically, > > David > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From vincent at vincentdavis.net Thu May 27 13:08:19 2010 From: vincent at vincentdavis.net (Vincent Davis) Date: Thu, 27 May 2010 11:08:19 -0600 Subject: [Numpy-discussion] How to distinguish between number and string dypes In-Reply-To: <4BFE9D03.7010002@gmail.com> References: <201005270927.25802.faltet@pytables.org> <4BFE9D03.7010002@gmail.com> Message-ID: On Thu, May 27, 2010 at 10:25 AM, Bruce Southey wrote: > On 05/27/2010 10:40 AM, Vincent Davis wrote: > Can you give an example of what you are trying to do? > arr = np.array([(1,'a'),(2,'b')], dtype =[(num,int),(str, |s2)] No supposed I want to know if I can sum the values in 'num'. I could just try and then handle the exemption, but I would like to do something more like for col in arr.dtypes.names: if arr[col] "is a number": sum(arr[col]) I think i can use Roberts suggestion, I was not aware of np.number, I guess I need to look into the hierarchy more. The dtypes have a hierarchy. > > In [2]: np.issubdtype(float, np.number) > Out[2]: True > > In [3]: np.issubdtype(str, np.number) > Out[3]: False > > -- > Robert Kern Thanks Vincent > > > If some of your string arrays only have string representations of numbers > that you want to do the math on then you have to attempt to convert those > arrays into a numeric dtype (probably float) using for example asarray(). > > Bruce > > >>> import numpy as np > >>> a=np.array([1,2,3]) > >>> c=np.array(['1','2','3']) > >>> d=np.array(['a','b','1']) > >>> np.asarray(a, dtype=float) > array([ 1., 2., 3.]) > >>> np.asarray(c,dtype=float) > array([ 1., 2., 3.]) > >>> np.asarray(d,dtype=float) > Traceback (most recent call last): > File "", line 1, in > File "/usr/lib64/python2.6/site-packages/numpy/core/numeric.py", line > 284, in asarray > return array(a, dtype, copy=False, order=order) > ValueError: invalid literal for float(): a > >>> try: > ... np.asarray(d,dtype=float) > ... except: > ... print 'fail' > ... > fail > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > *Vincent Davis 720-301-3003 * vincent at vincentdavis.net my blog | LinkedIn -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Thu May 27 13:13:27 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Thu, 27 May 2010 13:13:27 -0400 Subject: [Numpy-discussion] NotImplemented returns In-Reply-To: References: Message-ID: On Thu, May 27, 2010 at 12:57 PM, Pauli Virtanen wrote: > Thu, 27 May 2010 10:21:14 -0600, Charles R Harris wrote: > [clip] >> Maybe an enhancement ticket. The NotImplemented return is appropriate >> for some functions, but for functions with a single argument we should >> probably raise an error. > > A NotImplemented value leaking to the the user is a bug. The function > should raise a ValueError instead. > > NotImplemented is meant only for use as a placeholder singleton in > implementation of rich comparison operations etc., and we shouldn't > introduce any new meanings IMHO. 
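As an aside on the point just made, the intended role of NotImplemented is as the placeholder that rich-comparison and binary-operator methods return so the interpreter can try the reflected operation. A toy sketch (plain Python, nothing numpy-specific, and the class is purely illustrative):

class Metres(object):
    def __init__(self, value):
        self.value = value

    def __eq__(self, other):
        if not isinstance(other, Metres):
            # Tell the interpreter this type cannot handle the comparison;
            # Python will then try other.__eq__(self) before giving up.
            return NotImplemented
        return self.value == other.value

A value like this should never reach the user unless a function hands it back explicitly, which is exactly the leak being reported above.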
http://projects.scipy.org/numpy/ticket/1494 with a few examples I don't know which component Josef > > -- > Pauli Virtanen > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From cekees at gmail.com Thu May 27 13:39:29 2010 From: cekees at gmail.com (Chris Kees) Date: Thu, 27 May 2010 12:39:29 -0500 Subject: [Numpy-discussion] error during config on Ubuntu powerpc64, numpy-1.4.1 Message-ID: Hi, I'm getting an error in check_long_double_representation on a linux/powerpc64 box. Has anybody seen this before/know a fix? -Chris > python -V Python 2.6.5 > uname -a Linux chl-29-200 2.6.32-21-powerpc64-smp #32-Ubuntu SMP Fri Apr 16 10:28:57 UTC 2010 ppc64 GNU/Linux >python setup.py install ... C compiler: gcc -m32 -fno-strict-aliasing -DNDEBUG -g -O3 -Wall -Wstrict-prototypes -fPIC compile options: '-Inumpy/core/src/private -Inumpy/core/src -Inumpy/core -Inumpy/core/src/npymath -Inumpy/core/src/multiarray -Inumpy/core/src/umath -Inumpy/core/include -I/home/cekees/src/proteus/linux /include/python2.6 -c' gcc: _configtest.c removing: _configtest.c _configtest.o Traceback (most recent call last): File "setup.py", line 187, in setup_package() File "setup.py", line 180, in setup_package configuration=configuration ) File "/home/cekees/src/proteus/externalPackages/numpy-1.4.1/numpy/distutils/core.py", line 186, in setup return old_setup(**new_attr) File "/home/cekees/src/proteus/linux/lib/python2.6/distutils/core.py", line 152, in setup dist.run_commands() File "/home/cekees/src/proteus/linux/lib/python2.6/distutils/dist.py", line 975, in run_commands self.run_command(cmd) File "/home/cekees/src/proteus/linux/lib/python2.6/distutils/dist.py", line 995, in run_command cmd_obj.run() File "/home/cekees/src/proteus/externalPackages/numpy-1.4.1/numpy/distutils/command/build.py", line 37, in run old_build.run(self) File "/home/cekees/src/proteus/linux/lib/python2.6/distutils/command/build.py", line 134, in run self.run_command(cmd_name) File "/home/cekees/src/proteus/linux/lib/python2.6/distutils/cmd.py", line 333, in run_command self.distribution.run_command(command) File "/home/cekees/src/proteus/linux/lib/python2.6/distutils/dist.py", line 995, in run_command cmd_obj.run() File "/home/cekees/src/proteus/externalPackages/numpy-1.4.1/numpy/distutils/command/build_src.py", line 152, in run self.build_sources() File "/home/cekees/src/proteus/externalPackages/numpy-1.4.1/numpy/distutils/command/build_src.py", line 169, in build_sources self.build_extension_sources(ext) File "/home/cekees/src/proteus/externalPackages/numpy-1.4.1/numpy/distutils/command/build_src.py", line 328, in build_extension_sources sources = self.generate_sources(sources, ext) File "/home/cekees/src/proteus/externalPackages/numpy-1.4.1/numpy/distutils/command/build_src.py", line 385, in generate_sources source = func(extension, build_dir) File "numpy/core/setup.py", line 413, in generate_config_h rep = check_long_double_representation(config_cmd) File "numpy/core/setup_common.py", line 136, in check_long_double_representation type = long_double_representation(pyod(object)) File "numpy/core/setup_common.py", line 244, in long_double_representation raise ValueError("Unrecognized format (%s)" % saw) ValueError: Unrecognized format (['001', '043', '105', '147', '211', '253', '315', '357', '301', '235', '157', '064', '124', '000', '000', '000', '000', '000', '000', '000', '000', '000', '000', '000', '376', '334', '272', 
'230', '166', '124', '062', '020']) -------------- next part -------------- An HTML attachment was scrubbed... URL: From justin.t.riley at gmail.com Thu May 27 13:53:31 2010 From: justin.t.riley at gmail.com (Justin Riley) Date: Thu, 27 May 2010 13:53:31 -0400 Subject: [Numpy-discussion] StarCluster 0.91 - NumPy/SciPy Clusters on EC2 In-Reply-To: References: Message-ID: <4BFEB19B.5040804@gmail.com> This is a one-time message to announce the availability of version 0.91 of the StarCluster package. Why should you care? StarCluster allows you to create NumPy/SciPy clusters configured with NFS-shared filesystems and the Sun Grid Engine queueing system out of the box on Amazon's Elastic Compute Cloud (EC2). The NumPy/SciPy installations have been compiled against a custom-compiled ATLAS for the larger EC2 instances. About ----- There is an article about StarCluster on www.hpcinthecloud.com: http://www.hpcinthecloud.com/features/StarCluster-Brings-HPC-to-the-Amazon-Cloud-94099324.html There is also a screencast of installing, configuring, launching, and terminating an HPC cluster on Amazon EC2: http://www.hpcinthecloud.com/blogs/MITs-StarCluster-An-Update-with-Screencast-94599554.html Project description from PyPI: StarCluster is a utility for creating and managing scientific computing clusters hosted on Amazon's Elastic Compute Cloud (EC2). StarCluster utilizes Amazon's EC2 web service to create and destroy clusters of Linux virtual machines on demand. To get started, the user creates a simple configuration file with their AWS account details and a few cluster preferences (e.g. number of machines, machine type, ssh keypairs, etc). After creating the configuration file and running StarCluster's "start" command, a cluster of Linux machines configured with the Sun Grid Engine queuing system, an NFS-shared /home directory, and OpenMPI with password-less ssh is created and ready to go out-of-the-box. Running StarCluster's "stop" command will shutdown the cluster and stop paying for service. This allows the user to only pay for what they use. StarCluster provides a Ubuntu-based Amazon Machine Image (AMI) in 32bit and 64bit architectures. The AMI contains an optimized NumPy/SciPy/Atlas/Blas/Lapack installation compiled for the larger Amazon EC2 instance types. The AMI also comes with Sun Grid Engine (SGE) and OpenMPI compiled with SGE support. The public AMI can easily be customized by launching a single instance of the public AMI, installing additional software on the instance, and then using StarCluster can also utilize Amazon's Elastic Block Storage (EBS) volumes to provide persistent data storage for a cluster. EBS volumes allow you to store large amounts of data in the Amazon cloud and are also easy to back-up and replicate in the cloud. StarCluster will mount and NFS-share any volumes specified in the config. StarCluster's "createvolume" command provides the ability to automatically create, format, and partition new EBS volumes for use with StarCluster. Download -------- StarCluster is available on PyPI (http://pypi.python.org/pypi/StarCluster) and also on the project's website: http://web.mit.edu/starcluster You will find the docs as well as links to the StarCluster mailing list on the website. 
New in this version: -------------------- * support for launching and managing multiple clusters on EC2 * added "listclusters" command for showing all active clusters on EC2 * support for attaching and NFS-sharing multiple EBS volumes * added createimage and createvolume commands for easily creating new AMIs and EBS volumes for use with StarCluster * experimental support for launching clusters using spot instances * added support for StarCluster "plugins" that provide the ability to perform additional configuration/setup routines on top of StarCluster's default cluster configuration * added "listpublic" command for listing all available public StarCluser AMIs that can be used with StarCluster * bash/zsh command line completion for StarCluster's command line interface From friedrichromstedt at gmail.com Thu May 27 14:07:43 2010 From: friedrichromstedt at gmail.com (Friedrich Romstedt) Date: Thu, 27 May 2010 20:07:43 +0200 Subject: [Numpy-discussion] curious about how people would feel about moving to github In-Reply-To: References: <4BFE0311.20903@silveregg.co.jp> Message-ID: I just want to say that I used Git on Windows without any problem using a minGW built Git, i.e. msysgit: http://code.google.com/p/msysgit/downloads/list The only problem I see is that with CR / CRLF / LF. When one installs msysgit, one can choose what procedure to take - to commit to the repo with windows or unix line endings. I made the mistake and chose windows line endings, and now all my Git repos have dos format ... pity since I switched to Mac now: my git now wants to commit with another ending format, and everything has to be updated - I wonder whether there is some possibility to revert this virtual-nothing change after committing? - But this is a mess easily avoidable by the virtual instruction "Configure your git to commit unix endings!" One more thing, iirc msysgit requires that no mingw is installed already - but when you have mingw you can compile git yourself anyway, or am I wrong? Friedrich From matthew.brett at gmail.com Thu May 27 14:09:00 2010 From: matthew.brett at gmail.com (Matthew Brett) Date: Thu, 27 May 2010 11:09:00 -0700 Subject: [Numpy-discussion] curious about how people would feel about moving to github In-Reply-To: References: Message-ID: Hi, > Maybe most importantly, distributed revision control places any > possible contributor on equal footing with those with commit access; > this is one important step in making contributors feel valued. I think this is a very important point, but subtle. I realize that's a dangerous combination, but I'm going to have a go at exposition. I think it is true that the distributed model _tends_ to make contributors feel more welcome, but it's not to do with permissions, it's to do with the process. The process is much more important than the permissions. If we want new contributors to feel welcome, we need a clear, explicit process, that everyone agrees to, and follows. I don't mean something enforced by permissions, but something followed, by convention, and with care, by all the developers. That provides a clear and healthy basis for people to join. In that situation, and in that situation only, new developers do not worry about whether they are clever or important or well-known enough to contribute code. That does tend to follow from the distributed model, because it is fundamentally built on the 'show me the code' model of development. Not surprisingly. 
I completely agree with Anne that we will work it out when we switch, and the details of process should not delay us. But, this is just a vote for some careful thought - and discussion - and agreement - on what sort of atmosphere we want to convey as a community. That atmosphere comes directly from our development model - or rather - the development model is the clearest indicator of what kind of colleagues we are. Are we careful? Are we serious? Are we thoughtful? Are we open? Are we clear? Do we value learning and teaching? Are we coding for the long-term? See you, Matthew From charlesr.harris at gmail.com Thu May 27 14:34:08 2010 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu, 27 May 2010 12:34:08 -0600 Subject: [Numpy-discussion] NotImplemented returns In-Reply-To: References: Message-ID: On Thu, May 27, 2010 at 9:59 AM, wrote: > A while ago we had a brief discussion about this. > > > Is this a feature? or should there be a ticket for this > > >>> np.sqrt('5') > NotImplemented > >>> a = np.sqrt('5') > >>> a > NotImplemented > >>> type(a) > > > What numpy version? I get In [2]: sqrt(['a']) --------------------------------------------------------------------------- NotImplementedError Traceback (most recent call last) /home/charris/ in () NotImplementedError: Not implemented for this type In [3]: sqrt('a') --------------------------------------------------------------------------- NotImplementedError Traceback (most recent call last) /home/charris/ in () NotImplementedError: Not implemented for this type Which is entirely different. Note that Py_NotImplemented is *not* only for comparisons, it is a signal to the interpreter to try the r* version of a binary operator. OTOH, In [4]: maximum('a',1) Out[4]: NotImplemented Which is still a problem. I think no ufunc should return NotImplemented, it should be reserved to methods so the interpreter will handle it correctly. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From friedrichromstedt at gmail.com Thu May 27 14:38:22 2010 From: friedrichromstedt at gmail.com (Friedrich Romstedt) Date: Thu, 27 May 2010 20:38:22 +0200 Subject: [Numpy-discussion] curious about how people would feel about moving to github In-Reply-To: References: <4BFE0311.20903@silveregg.co.jp> Message-ID: 2010/5/27 Friedrich Romstedt : > I just want to say that I used Git on Windows without any problem > using a minGW built Git, i.e. msysgit: Hm, I read the other thread too late to recognise this to be discussed already - Sorry And hey, even Windows has Tab completion of path names, even in the, agreed, terrible, non-PowerShell console. From josef.pktd at gmail.com Thu May 27 14:40:07 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Thu, 27 May 2010 14:40:07 -0400 Subject: [Numpy-discussion] NotImplemented returns In-Reply-To: References: Message-ID: On Thu, May 27, 2010 at 2:34 PM, Charles R Harris wrote: > > > On Thu, May 27, 2010 at 9:59 AM, wrote: >> >> A while ago we had a brief discussion about this. >> >> >> Is this a feature? or should there be a ticket for this >> >> >>> np.sqrt('5') >> NotImplemented >> >>> a = np.sqrt('5') >> >>> a >> NotImplemented >> >>> type(a) >> >> > > What numpy version? I get Obviously I'm too old (numpy 1.4.0) Josef > > In [2]: sqrt(['a']) > --------------------------------------------------------------------------- > NotImplementedError?????????????????????? 
Traceback (most recent call last) > > /home/charris/ in () > > NotImplementedError: Not implemented for this type > > In [3]: sqrt('a') > --------------------------------------------------------------------------- > NotImplementedError?????????????????????? Traceback (most recent call last) > > /home/charris/ in () > > NotImplementedError: Not implemented for this type > > > Which is entirely different. Note that Py_NotImplemented is *not* only for > comparisons, it is a signal to the interpreter to try the r* version of a > binary operator. > > > OTOH, > > In [4]: maximum('a',1) > Out[4]: NotImplemented > > Which is still a problem. I think no ufunc should return NotImplemented, it > should be reserved to methods so the interpreter will handle it correctly. > > Chuck > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > From d.l.goldsmith at gmail.com Thu May 27 13:36:26 2010 From: d.l.goldsmith at gmail.com (David Goldsmith) Date: Thu, 27 May 2010 10:36:26 -0700 Subject: [Numpy-discussion] Extending documentation to c code In-Reply-To: References: Message-ID: On Thu, May 27, 2010 at 9:18 AM, Charles R Harris wrote: > > On Wed, May 26, 2010 at 8:14 AM, Pauli Virtanen wrote: > >> Wed, 26 May 2010 07:15:08 -0600, Charles R Harris wrote: >> > On Wed, May 26, 2010 at 2:59 AM, Pauli Virtanen wrote: >> > >> >> Wed, 26 May 2010 06:57:27 +0900, David Cournapeau wrote: [clip: >> >> doxygen] >> >> > It is yet another format to use inside C sources (I don't think >> >> > doxygen supports rest), and I would rather have something that is >> >> > similar, ideally integrated into sphinx. It also generates rather >> >> > ugly doc by default, >> >> >> >> Anyway, we can probably nevertheless just agree on a readable >> >> plain-text/ rst format, and then just use doxygen to generate the docs, >> >> as a band-aid. >> >> >> >> http://github.com/pv/numpycdoc >> > >> > Neat. I didn't quite see the how how you connected the rst documentation >> > and doxygen. >> >> I didn't :) >> >> But I just did: doing this it was actually a 10 min job since Doxygen >> accepts HTML -- now it parses the comments as RST and renders it properly >> as HTML in the Doxygen output. Of course getting links etc. to work would >> require more effort, but that's left as an exercise for someone else to >> finish. >> >> > Why don't you go ahead and merge this. If someone wants to substitute > something else for doxygen at some point, then that is still open, meanwhile > we can get started on writing some cdocs. In particular, it would be nice if > the folks doing the code refactoring also documented any new functions. > Thanks for being a voice for change! :-) > We can also put together a numpycdoc standard to go with it. I think your > idea of combining the standard numpy doc format with the usual c code > comment style is the way to go. > And certainly at this early stage something is better than nothing. DG > > Chuck > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -- Mathematician: noun, someone who disavows certainty when their uncertainty set is non-empty, even if that set has measure zero. Hope: noun, that delusive spirit which escaped Pandora's jar and, with her lies, prevents mankind from committing a general suicide. 
(As interpreted by Robert Graves) -------------- next part -------------- An HTML attachment was scrubbed... URL: From cournape at gmail.com Thu May 27 18:37:32 2010 From: cournape at gmail.com (David Cournapeau) Date: Fri, 28 May 2010 07:37:32 +0900 Subject: [Numpy-discussion] Help Convolution with binaural filters(HRTFs) In-Reply-To: References: Message-ID: On Fri, May 28, 2010 at 1:58 AM, arthur de conihout wrote: > Thank you for the answer > i would love trying audiolab but i got this error while importing i m > running python2.6 > >>>> import audiolab > /usr/local/lib/python2.6/dist-packages/audiolab-0.0.0-py2.6-linux-i686.egg/audiolab/soundio/play.py:48: > UserWarning: Could not import alsa backend; most probably, you did not have > alsa headers when building audiolab How did you install it ? Your installation looks bogus (if you build it by yourself, please give the *exact* instructions you used) David From josef.pktd at gmail.com Thu May 27 19:19:29 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Thu, 27 May 2010 19:19:29 -0400 Subject: [Numpy-discussion] crash in np.poly Message-ID: I was tracking a test failure/crash in scipy.signal.ltisys numpy 1.4.0: >>> np.poly(np.zeros((0,0))) ** On entry to DGEEV parameter number 5 had an illegal value Josef From charlesr.harris at gmail.com Thu May 27 21:57:08 2010 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu, 27 May 2010 19:57:08 -0600 Subject: [Numpy-discussion] crash in np.poly In-Reply-To: References: Message-ID: On Thu, May 27, 2010 at 5:19 PM, wrote: > I was tracking a test failure/crash in scipy.signal.ltisys > > numpy 1.4.0: > > >>> np.poly(np.zeros((0,0))) > ** On entry to DGEEV parameter number 5 had an illegal value > > > In current: In [1]: np.poly(np.zeros((0,0))) --------------------------------------------------------------------------- ValueError Traceback (most recent call last) /home/charris/ in () /usr/local/lib/python2.6/dist-packages/numpy/lib/polynomial.pyc in poly(seq_of_zeros) 126 pass 127 else: --> 128 raise ValueError, "input must be 1d or square 2d array." 129 130 if len(seq_of_zeros) == 0: ValueError: input must be 1d or square 2d array. Looks like it could use a better error message in this case, though. Could you open a low priority ticket for this? Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Thu May 27 22:00:22 2010 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu, 27 May 2010 20:00:22 -0600 Subject: [Numpy-discussion] crash in np.poly In-Reply-To: References: Message-ID: On Thu, May 27, 2010 at 7:57 PM, Charles R Harris wrote: > > > On Thu, May 27, 2010 at 5:19 PM, wrote: > >> I was tracking a test failure/crash in scipy.signal.ltisys >> >> numpy 1.4.0: >> >> >>> np.poly(np.zeros((0,0))) >> ** On entry to DGEEV parameter number 5 had an illegal value >> >> >> In current: > > In [1]: np.poly(np.zeros((0,0))) > --------------------------------------------------------------------------- > ValueError Traceback (most recent call last) > > /home/charris/ in () > > /usr/local/lib/python2.6/dist-packages/numpy/lib/polynomial.pyc in > poly(seq_of_zeros) > 126 pass > 127 else: > --> 128 raise ValueError, "input must be 1d or square 2d array." > 129 > 130 if len(seq_of_zeros) == 0: > > ValueError: input must be 1d or square 2d array. > > > Looks like it could use a better error message in this case, though. Could > you open a low priority ticket for this? 
> > The new Polynomial class does somewhat better: In [6]: Poly(zeros((0,0))) --------------------------------------------------------------------------- ValueError Traceback (most recent call last) /home/charris/ in () /usr/local/lib/python2.6/dist-packages/numpy/polynomial/polynomial.pyc in __init__(self, coef, domain) /usr/local/lib/python2.6/dist-packages/numpy/polynomial/polyutils.pyc in as_series(alist, trim) 156 arrays = [np.array(a, ndmin=1, copy=0) for a in alist] 157 if min([a.size for a in arrays]) == 0 : --> 158 raise ValueError("Coefficient array is empty") 159 if any([a.ndim != 1 for a in arrays]) : 160 raise ValueError("Coefficient array is not 1-d") ValueError: Coefficient array is empty Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Thu May 27 22:14:19 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Thu, 27 May 2010 22:14:19 -0400 Subject: [Numpy-discussion] crash in np.poly In-Reply-To: References: Message-ID: On Thu, May 27, 2010 at 10:00 PM, Charles R Harris wrote: > > > On Thu, May 27, 2010 at 7:57 PM, Charles R Harris > wrote: >> >> >> On Thu, May 27, 2010 at 5:19 PM, wrote: >>> >>> I was tracking a test failure/crash in scipy.signal.ltisys >>> >>> numpy 1.4.0: >>> >>> >>> np.poly(np.zeros((0,0))) >>> ?** On entry to DGEEV ?parameter number ?5 had an illegal value >>> >>> >> In current: >> >> In [1]: np.poly(np.zeros((0,0))) >> >> --------------------------------------------------------------------------- >> ValueError??????????????????????????????? Traceback (most recent call >> last) >> >> /home/charris/ in () >> >> /usr/local/lib/python2.6/dist-packages/numpy/lib/polynomial.pyc in >> poly(seq_of_zeros) >> ??? 126???????? pass >> ??? 127???? else: >> --> 128???????? raise ValueError, "input must be 1d or square 2d array." >> ??? 129 >> ??? 130???? if len(seq_of_zeros) == 0: >> >> ValueError: input must be 1d or square 2d array. >> >> >> Looks like it could use a better error message in this case, though. Could >> you open a low priority ticket for this? >> > > The new Polynomial class does somewhat better: > > In [6]: Poly(zeros((0,0))) > --------------------------------------------------------------------------- > ValueError??????????????????????????????? Traceback (most recent call last) > > /home/charris/ in () > > /usr/local/lib/python2.6/dist-packages/numpy/polynomial/polynomial.pyc in > __init__(self, coef, domain) > > /usr/local/lib/python2.6/dist-packages/numpy/polynomial/polyutils.pyc in > as_series(alist, trim) > ??? 156???? arrays = [np.array(a, ndmin=1, copy=0) for a in alist] > ??? 157???? if min([a.size for a in arrays]) == 0 : > --> 158???????? raise ValueError("Coefficient array is empty") > ??? 159???? if any([a.ndim != 1 for a in arrays]) : > ??? 160???????? 
raise ValueError("Coefficient array is not 1-d") > > ValueError: Coefficient array is empty > > Chuck http://projects.scipy.org/numpy/ticket/1495 Thanks, Josef > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > From ndbecker2 at gmail.com Fri May 28 09:24:49 2010 From: ndbecker2 at gmail.com (Neal Becker) Date: Fri, 28 May 2010 09:24:49 -0400 Subject: [Numpy-discussion] curious about how people would feel about moving to github References: Message-ID: I prefer python, so I prefer mercurial From dagss at student.matnat.uio.no Fri May 28 09:28:30 2010 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Fri, 28 May 2010 15:28:30 +0200 Subject: [Numpy-discussion] curious about how people would feel about moving to github In-Reply-To: References: Message-ID: <4BFFC4FE.60305@student.matnat.uio.no> Neal Becker wrote: > I prefer python, so I prefer mercurial > > http://hg-git.github.com/ Dag Sverre From oliphant at enthought.com Fri May 28 13:13:57 2010 From: oliphant at enthought.com (Travis Oliphant) Date: Fri, 28 May 2010 12:13:57 -0500 Subject: [Numpy-discussion] Introduction to Scott, Jason, and (possibly) others from Enthought In-Reply-To: References: <2D706A49-3574-4E5D-81CD-B89B5E6BD0B5@enthought.com> <11E425D5-D34A-44FC-A0B1-6D2641D522EA@enthought.com> Message-ID: On May 27, 2010, at 6:51 AM, Ralf Gommers wrote: > > > On Wed, May 26, 2010 at 12:23 PM, Travis Oliphant > wrote: > > On May 25, 2010, at 5:06 PM, David Cournapeau wrote: > > > On Wed, May 26, 2010 at 6:19 AM, Charles R Harris > > wrote: > > > >> Sounds good, but what if it doesn't get finished in a few months? > I think we > >> should get 2.0.0 out pronto, ideally it would already have been > released. I > >> think a major refactoring like this proposal should get the 3.0.0 > label. > > > > Naming it 3.0 or 2.1 does not matter much - I think we should avoid > > breaking things twice. I can see a few solutions: > > - postpone 2.0 "indefinitely", until this new work is done > > - backport py3k changes to 1.5 (which would be API and ABI > > compatible with 1.4.1), and 2.0 would contain all the breaking > > changes. > > This is an interesting idea and also workable. > > > > > I am really worried about breaking things once now and once in a few > > months (or even a year). > > I am too. That's why this discussion. We will have the NumPy > refactor done by end of July at the latest. Numpy 2.0 should be > able to come out in August. > > This thread got a bit side-tracked with the move to git, but I don't > see a conclusion about what to release when. > > Even if the refactoring is done in July, I think a 2.0 release with > so many major changes will probably need a longer test/release > cycle. So if we say September, do you still want a 1.5 release? I think this makes sense so that we can reschedule NumPy 2.0 for September and still provide a release with the Python 3k changes (I am assuming these can be done in an ABI-compatible way). -Travis -------------- next part -------------- An HTML attachment was scrubbed... URL: From sierra_mtnview at sbcglobal.net Fri May 28 14:36:04 2010 From: sierra_mtnview at sbcglobal.net (Wayne Watson) Date: Fri, 28 May 2010 11:36:04 -0700 Subject: [Numpy-discussion] Gauss-Newton Method in Python? Message-ID: <4C000D14.1060802@sbcglobal.net> Is Subject method available in Python? -- Wayne Watson (Watson Adventures, Prop., Nevada City, CA) (121.015 Deg. 
W, 39.262 Deg. N) GMT-8 hr std. time) Obz Site: 39? 15' 7" N, 121? 2' 32" W, 2700 feet There are no statues or memorials dedicated to Thomas Paine for his substantial part in the American Revolution. -- An observation in The Science of Liberty by Timoth Ferris Web Page: From arthurdeconihout at gmail.com Fri May 28 14:42:02 2010 From: arthurdeconihout at gmail.com (arthur de conihout) Date: Fri, 28 May 2010 20:42:02 +0200 Subject: [Numpy-discussion] Installing audiolab on python 2.6 Message-ID: Hi I reinstall it on a different machine i got the same error with the nsdfile no warning anymore with audiolab . I followed these points: Requirements? audiolab requires the following softwares: - a python interpreter. - libsndfile - numpy (any version >= 1.2 should work). - setuptools On Ubuntu, you can install the dependencies as follow: sudo apt-get install python-dev python-numpy python-setuptools libsndfile-dev Optional? Audiolab can optionally install audio backends. For now, only alsa (Linux) and Core Audio (Mac OS X) are supported. On Linux, you need alsa headers for this to work; on Ubuntu, you can install them with the following command: sudo apt-get install libasound2-dev For Mac OS X, you need the CoreAudio framework, available on the Apple website. Build?For unix users, if libsndfile is installed in standart location (eg /usr/lib, /usr/local/lib), the installer should be able to find them automatically, and you only need to do a ?python setup.py install? Concerning the eventual site.cfg file i have no idea what to do with this it might be the problem maybe you can give advice? Thank you 2010/5/28 David Cournapeau > On Fri, May 28, 2010 at 1:58 AM, arthur de conihout > wrote: > > Thank you for the answer > > i would love trying audiolab but i got this error while importing i m > > running python2.6 > > > >>>> import audiolab > > > /usr/local/lib/python2.6/dist-packages/audiolab-0.0.0-py2.6-linux-i686.egg/audiolab/soundio/play.py:48: > > UserWarning: Could not import alsa backend; most probably, you did not > have > > alsa headers when building audiolab > > How did you install it ? Your installation looks bogus (if you build > it by yourself, please give the *exact* instructions you used) > > David > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From arthurdeconihout at gmail.com Fri May 28 15:02:44 2010 From: arthurdeconihout at gmail.com (arthur de conihout) Date: Fri, 28 May 2010 21:02:44 +0200 Subject: [Numpy-discussion] Installing audiolab on python 2.6 In-Reply-To: References: Message-ID: *ok problem solved i close the topic sorry i just had to reinstall libnsdfile correctly thanks* 2010/5/28 arthur de conihout > Hi > I reinstall it on a different machine i got the same error with the nsdfile > no warning anymore with audiolab . > I followed these points: > > Requirements? > > audiolab requires the following softwares: > > > - a python interpreter. > - libsndfile > - numpy (any version >= 1.2 should work). > - setuptools > > On Ubuntu, you can install the dependencies as follow: > > sudo apt-get install python-dev python-numpy python-setuptools libsndfile-dev > > Optional? > > Audiolab can optionally install audio backends. For now, only alsa (Linux) > and Core Audio (Mac OS X) are supported. 
On Linux, you need alsa headers for > this to work; on Ubuntu, you can install them with the following command: > > sudo apt-get install libasound2-dev > > For Mac OS X, you need the CoreAudio framework, available on the Apple > website. > Build?For unix users, if libsndfile is installed in standart location (eg > /usr/lib, /usr/local/lib), the installer should be able to find them > automatically, and you only need to do a ?python setup.py install? > > Concerning the eventual site.cfg file i have no idea what to do with this > it might be the problem maybe you can give advice? > > Thank you > > > > 2010/5/28 David Cournapeau > >> On Fri, May 28, 2010 at 1:58 AM, arthur de conihout >> wrote: >> > Thank you for the answer >> > i would love trying audiolab but i got this error while importing i m >> > running python2.6 >> > >> >>>> import audiolab >> > >> /usr/local/lib/python2.6/dist-packages/audiolab-0.0.0-py2.6-linux-i686.egg/audiolab/soundio/play.py:48: >> > UserWarning: Could not import alsa backend; most probably, you did not >> have >> > alsa headers when building audiolab >> >> How did you install it ? Your installation looks bogus (if you build >> it by yourself, please give the *exact* instructions you used) >> >> David >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Fri May 28 15:09:33 2010 From: charlesr.harris at gmail.com (Charles R Harris) Date: Fri, 28 May 2010 13:09:33 -0600 Subject: [Numpy-discussion] Gauss-Newton Method in Python? In-Reply-To: <4C000D14.1060802@sbcglobal.net> References: <4C000D14.1060802@sbcglobal.net> Message-ID: What problem are you trying to solve. The leastsq algorithm in scipy is effectively Gauss-Newton when that is appropriate to the problem. Chuck On Fri, May 28, 2010 at 12:36 PM, Wayne Watson wrote: > Is Subject method available in Python? > > -- > Wayne Watson (Watson Adventures, Prop., Nevada City, CA) > > (121.015 Deg. W, 39.262 Deg. N) GMT-8 hr std. time) > Obz Site: 39? 15' 7" N, 121? 2' 32" W, 2700 feet > > There are no statues or memorials dedicated to > Thomas Paine for his substantial part in the > American Revolution. > > -- An observation in The Science of Liberty > by Timoth Ferris > > > Web Page: > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From sierra_mtnview at sbcglobal.net Fri May 28 18:21:04 2010 From: sierra_mtnview at sbcglobal.net (Wayne Watson) Date: Fri, 28 May 2010 15:21:04 -0700 Subject: [Numpy-discussion] Gauss-Newton Method in Python? In-Reply-To: References: <4C000D14.1060802@sbcglobal.net> Message-ID: <4C0041D0.9020806@sbcglobal.net> An HTML attachment was scrubbed... URL: From xavier.gnata at gmail.com Fri May 28 18:46:52 2010 From: xavier.gnata at gmail.com (Xavier Gnata) Date: Sat, 29 May 2010 00:46:52 +0200 Subject: [Numpy-discussion] Gauss-Newton Method in Python? In-Reply-To: <4C0041D0.9020806@sbcglobal.net> References: <4C000D14.1060802@sbcglobal.net> <4C0041D0.9020806@sbcglobal.net> Message-ID: <4C0047DC.6000601@gmail.com> Let (Xi,Yi) be the positions of your stars on the sky. i in the 1 to N range. 
Let (Xj,Yj) be the positions of your stars images (PSF) on your picture. i in the 1 to N range. You can parametrize the distortion this way: Xj_param = Px(Xi,Yi) Yj_param = Py(Xi,Yi) where Px and Py are the two polynomials minimizing the RMS residuals in between the parametrized positions and the measured one on the picture. As the distortion is a smooth function over the field of view, low order polynomials are sufficient (order 3 or 5) to get very low RMS residuals. Pros : No optimization algorithm need. You can compute the polynomials coefficients using* pseudoinverse.* There is no question about global versus local minimum anymore. Only* pseudoinverses. Px qnd Py provides you with a full description of your system optical distortion. *Cons : I don't know. Once you have the polynomials, you can compute whatever other distortion parametrization you may prefer. Xavier > I'm doing plate reduction on astro photos. There's non-linearity in > the lens. Basically, one is trying to estimate several lens > parameters by look at a field of known stars versus ones measured on a > photo plate. The author states it can be solved by taking first > derivatives to linearize matters, and iteratively apply least squares > until the change in parameters falls below some limits. Gauss-Newton > seems a bit different in that it tries to minimize the sum of squares. > > In a follow up paper, he refers to the process as a gradient > method. Up until then, my best guess was G-N. I suspect that you are > hinting at the Gradient plus LSQ (least squares). However, out of > curiosity, isn't their a library of optimization methods like > Marquardt or Davidon? > > On 5/28/2010 12:09 PM, Charles R Harris wrote: >> What problem are you trying to solve. The leastsq algorithm in scipy >> is effectively Gauss-Newton when that is appropriate to the problem. >> >> Chuck >> >> On Fri, May 28, 2010 at 12:36 PM, Wayne Watson >> > >> wrote: >> >> Is Subject method available in Python? >> >> -- >> Wayne Watson (Watson Adventures, Prop., Nevada City, CA) >> >> (121.015 Deg. W, 39.262 Deg. N) GMT-8 hr std. time) >> Obz Site: 39? 15' 7" N, 121? 2' 32" W, 2700 feet >> >> There are no statues or memorials dedicated to >> Thomas Paine for his substantial part in the >> American Revolution. >> >> -- An observation in The Science of Liberty >> by Timoth Ferris >> >> >> Web Page:> > >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> > > -- > Wayne Watson (Watson Adventures, Prop., Nevada City, CA) > > (121.015 Deg. W, 39.262 Deg. N) GMT-8 hr std. time) > Obz Site: 39? 15' 7" N, 121? 2' 32" W, 2700 feet > > There are no statues or memorials dedicated to > Thomas Paine for his substantial part in the > American Revolution. > > -- An observation in The Science of Liberty > by Timoth Ferris > > > Web Page: > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From sierra_mtnview at sbcglobal.net Fri May 28 19:45:21 2010 From: sierra_mtnview at sbcglobal.net (Wayne Watson) Date: Fri, 28 May 2010 16:45:21 -0700 Subject: [Numpy-discussion] Finding Star Images on a Photo (Video chip) Plate? 
Message-ID: <4C005591.1060900@sbcglobal.net> Suppose I have a 640x480 pixel video chip and would like to find star images on it, possible planets and the moon. A possibility of noise exits, or bright pixels. Is there a known method for finding the centroids of these astro objects? -- Wayne Watson (Watson Adventures, Prop., Nevada City, CA) (121.015 Deg. W, 39.262 Deg. N) GMT-8 hr std. time) Obz Site: 39? 15' 7" N, 121? 2' 32" W, 2700 feet There are no statues or memorials dedicated to Thomas Paine for his substantial part in the American Revolution. -- An observation in The Science of Liberty by Timoth Ferris Web Page: From charlesr.harris at gmail.com Fri May 28 20:09:55 2010 From: charlesr.harris at gmail.com (Charles R Harris) Date: Fri, 28 May 2010 18:09:55 -0600 Subject: [Numpy-discussion] Finding Star Images on a Photo (Video chip) Plate? In-Reply-To: <4C005591.1060900@sbcglobal.net> References: <4C005591.1060900@sbcglobal.net> Message-ID: On Fri, May 28, 2010 at 5:45 PM, Wayne Watson wrote: > Suppose I have a 640x480 pixel video chip and would like to find star > images on it, possible planets and the moon. A possibility of noise > exits, or bright pixels. Is there a known method for finding the > centroids of these astro objects? > > You can threshold the image and then cluster the pixels in objects. I've done this on occasion using my own software, but I think there might be something in scipy/ndimage that does the same. Someone here will know. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From aarchiba at physics.mcgill.ca Fri May 28 20:41:43 2010 From: aarchiba at physics.mcgill.ca (Anne Archibald) Date: Fri, 28 May 2010 21:41:43 -0300 Subject: [Numpy-discussion] Finding Star Images on a Photo (Video chip) Plate? In-Reply-To: References: <4C005591.1060900@sbcglobal.net> Message-ID: On 28 May 2010 21:09, Charles R Harris wrote: > > > On Fri, May 28, 2010 at 5:45 PM, Wayne Watson > wrote: >> >> Suppose I have a 640x480 pixel video chip and would like to find star >> images on it, possible planets and the moon. A possibility of noise >> exits, or bright pixels. Is there a known method for finding the >> centroids of these astro objects? >> > > You can threshold the image and then cluster the pixels in objects. I've > done this on occasion using my own software, but I think there might be > something in scipy/ndimage that does the same. Someone here will know. There are sort of two passes here - the first is to find all the stars, and the second is to fine down their positions, ideally to less than a pixel. For the former, thresholding and clumping is probably the way to go. For the latter I think a standard approach is PSF fitting - that is, you fit (say) a two-dimensional Gaussian to the pixels near your star. You'll fit for at least central (subpixel) position, probably radius, and maybe eccentricity and orientation. You might even fit for a more sophisticated PSF (doughnuts are natural for Schmidt-Cassegrain telescopes, or the diffraction pattern of your spider). Any spot whose best-fit PSF is just one pixel wide is noise or a cosmic ray hit or a hotpixel; any spot whose best-fit PSF is huge is a detector splodge or a planet or galaxy. All this assumes that your CCD has more resolution than your optics; if this is not the case you're more or less stuck, since a star is then just a bright pixel. 
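To make the first pass concrete, here is a minimal sketch of the threshold-and-clump approach using scipy.ndimage; the five-sigma cut and the median background estimate are illustrative assumptions, not anything from the thread, and a real frame would want a more careful noise model:

import numpy as np
from scipy import ndimage

def rough_star_centroids(image, nsigma=5.0):
    # Flag pixels well above an estimated background level.
    background = np.median(image)
    noise = image.std()
    mask = image > background + nsigma * noise

    # Group adjacent bright pixels into labelled clumps.
    labels, nclumps = ndimage.label(mask)

    # Intensity-weighted centroids give a crude sub-pixel position per
    # clump; a PSF fit, as described above, would refine these further.
    return ndimage.center_of_mass(image - background, labels,
                                  np.arange(1, nclumps + 1))

Planets and the moon would show up here as clumps much larger than a stellar image, which matches the size test described above.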
In this case your problem is one of combining multiple offset images, dark skies, and dome flats to try to distinguish detector crud and cosmic ray hits from actual stars. It can be done, but it will be a colossal pain if your pointing accuracy is not subpixel (which it probably won't be). In any case, my optical friends tell me that the Right Way to do all this is to use all the code built into IRAF (or its python wrapper, pyraf) that does all this difficult work for you. Anne P.S. if your images have been fed through JPEG or some other lossy compression the process will become an utter nightmare. -A > > Chuck > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > From pav at iki.fi Fri May 28 21:05:04 2010 From: pav at iki.fi (Pauli Virtanen) Date: Sat, 29 May 2010 01:05:04 +0000 (UTC) Subject: [Numpy-discussion] Introduction to Scott, Jason, and (possibly) others from Enthought References: <2D706A49-3574-4E5D-81CD-B89B5E6BD0B5@enthought.com> <11E425D5-D34A-44FC-A0B1-6D2641D522EA@enthought.com> Message-ID: Fri, 28 May 2010 12:13:57 -0500, Travis Oliphant wrote: [clip] > I think this makes sense so that we can reschedule NumPy 2.0 for > September and still provide a release with the Python 3k changes (I am > assuming these can be done in an ABI-compatible way). As far as I remember, the Py3 changes did not require breaking ABI (on Py2). -- Pauli Virtanen From sierra_mtnview at sbcglobal.net Fri May 28 22:59:50 2010 From: sierra_mtnview at sbcglobal.net (Wayne Watson) Date: Fri, 28 May 2010 19:59:50 -0700 Subject: [Numpy-discussion] Finding Star Images on a Photo (Video chip) Plate? In-Reply-To: References: <4C005591.1060900@sbcglobal.net> Message-ID: <4C008326.3060404@sbcglobal.net> An HTML attachment was scrubbed... URL: From aarchiba at physics.mcgill.ca Fri May 28 23:31:02 2010 From: aarchiba at physics.mcgill.ca (Anne Archibald) Date: Sat, 29 May 2010 00:31:02 -0300 Subject: [Numpy-discussion] Finding Star Images on a Photo (Video chip) Plate? In-Reply-To: <4C008326.3060404@sbcglobal.net> References: <4C005591.1060900@sbcglobal.net> <4C008326.3060404@sbcglobal.net> Message-ID: On 28 May 2010 23:59, Wayne Watson wrote: > That opened a few avenues. After reading this, I went on a merry search with > Google. I hit upon one interesting book, Handbook of CCD astronomy (Steve B. > Howell), that discusses PSFs. A Amazon Look Inside suggests this is mostly > about h/w. I tried to figure out how to reach the scipy mail list, but, as > once a year ago, couldn't figure out the newsgroup GMANE connection. This > search recalled to mind my Handbook of Astro Image? Processing by Berry and > Burnell. It has a few pages on the PSF. In the ref section for that > material(PSFs) there's another ref to Steve Howell that may be of use: Astro > CCD Observing and Reduction Techniques, ASP, Pacific Conf. Series, vol. 23, > 1992. There are further Berry and Burnell refs that may be applicable. Ah, sorry, I've been at an astro conference all week, I should have expanded that acronym. PSF is short for "Point Spread Function"; the idea is that with an optically good telescope, a point source anywhere in the field of view produces a blob of characteristic shape (often roughly a two-dimensional Gaussian) in your detector. The shape and size of this blob is set by your optics (including diffraction) and the atmospheric seeing. 
A star, being intrinsically a point source, produces a brighter or less bright version of this blob centered on the star's true position. To accurately measure the star's position (and often brightness) one usually fits a model blob to the noisy blob coming from the star of interest. I should note that this requires you to have more pixels than you "need", so that even a point source is spread over many pixels; without this it's impossible to get subpixel positioning (among other things). Older consumer digital cameras often lacked this, since it was difficult to put enough pixels on a CCD, but fortunately megapixel mania has helpfully ensured that no matter how sharp the focus, every feature in your image is smeared over many pixels. > I probed IRAF, SciPy, and Python, but it looks like a steep learning curve. > The SciPy tutorial page looks like overkill. They have what looks like very > large tutorials. Perhaps daunting. I did a quick shot at pyraf, a tutorial > page, but note it has a prereq of IRAF. Another daunting path. Wait, you think SciPy has too many tutorials? Or that they're too detailed? Just pick a short, easy, or sketchy one then. Here's one that's all three: >>> import scipy.stats >>> scipy.stats.norm.cdf(3) 0.9986501019683699 That's the value of the CDF of a standard normal at three sigma, i.e., one minus the false positive probability for a one-sided three sigma detection. > Well, maybe a DIY approach will do the trick for me. I haven't used IRAF yet (though I have data sets waiting), and I do understand the urge to write your own code rather than understanding someone else's, but let me point out that reliably extracting source parameters from astronomical images is *hard* and requires cleverness, attention to countless special cases, troubleshooting, and experience. But it's an old problem, and astronomers have taken all of the needed things listed above and built them into IRAF. Do consider using it. Anne > On 5/28/2010 5:41 PM, Anne Archibald wrote: > > On 28 May 2010 21:09, Charles R Harris wrote: > > > On Fri, May 28, 2010 at 5:45 PM, Wayne Watson > wrote: > > > Suppose I have a 640x480 pixel video chip and would like to find star > images on it, possible planets and the moon. A possibility of noise > exits, or bright pixels. Is there a known method for finding the > centroids of these astro objects? > > > > You can threshold the image and then cluster the pixels in objects. I've > done this on occasion using my own software, but I think there might be > something in scipy/ndimage that does the same. Someone here will know. > > > There are sort of two passes here - the first is to find all the > stars, and the second is to fine down their positions, ideally to less > than a pixel. For the former, thresholding and clumping is probably > the way to go. > > For the latter I think a standard approach is PSF fitting - that is, > you fit (say) a two-dimensional Gaussian to the pixels near your star. > You'll fit for at least central (subpixel) position, probably radius, > and maybe eccentricity and orientation. You might even fit for a more > sophisticated PSF (doughnuts are natural for Schmidt-Cassegrain > telescopes, or the diffraction pattern of your spider). Any spot whose > best-fit PSF is just one pixel wide is noise or a cosmic ray hit or a > hotpixel; any spot whose best-fit PSF is huge is a detector splodge or > a planet or galaxy. 
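Concretely, here is a rough sketch of the two-pass approach described above: threshold and clump pixels to find candidate stars, then fit a model blob to a small cutout around each candidate to get a subpixel centroid. It assumes scipy.ndimage and scipy.optimize.leastsq are available; the Gaussian-plus-flat-background model, the threshold and the cutout size are illustrative choices, not anything prescribed in the thread.

import numpy as np
from scipy import ndimage
from scipy.optimize import leastsq

def rough_centroids(img, thresh):
    # Pass 1: threshold, join connected bright pixels, take first-moment centroids.
    labels, n = ndimage.label(img > thresh)
    return ndimage.center_of_mass(img, labels, range(1, n + 1))

def gauss2d(p, x, y):
    # Model blob: circular Gaussian plus flat background, p = (amp, x0, y0, sigma, bg).
    amp, x0, y0, sigma, bg = p
    return amp * np.exp(-((x - x0) ** 2 + (y - y0) ** 2) / (2.0 * sigma ** 2)) + bg

def refine(img, row, col, half=5):
    # Pass 2: fit the model blob to a cutout around one candidate for a subpixel position.
    r, c = int(round(row)), int(round(col))
    cut = img[r - half:r + half + 1, c - half:c + half + 1].astype(float)
    y, x = np.indices(cut.shape)
    guess = (cut.max() - np.median(cut), half, half, 2.0, np.median(cut))
    p, ier = leastsq(lambda q: (gauss2d(q, x, y) - cut).ravel(), guess)
    return r - half + p[2], c - half + p[1]    # fitted (row, col)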
> > All this assumes that your CCD has more resolution than your optics; > if this is not the case you're more or less stuck, since a star is > then just a bright pixel. In this case your problem is one of > combining multiple offset images, dark skies, and dome flats to try to > distinguish detector crud and cosmic ray hits from actual stars. It > can be done, but it will be a colossal pain if your pointing accuracy > is not subpixel (which it probably won't be). > > In any case, my optical friends tell me that the Right Way to do all > this is to use all the code built into IRAF (or its python wrapper, > pyraf) that does all this difficult work for you. > > Anne > P.S. if your images have been fed through JPEG or some other lossy > compression the process will become an utter nightmare. -A > > > > Chuck > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > -- > Wayne Watson (Watson Adventures, Prop., Nevada City, CA) > > (121.015 Deg. W, 39.262 Deg. N) GMT-8 hr std. time) > Obz Site: 39? 15' 7" N, 121? 2' 32" W, 2700 feet > > There are no statues or memorials dedicated to > Thomas Paine for his substantial part in the > American Revolution. > > -- An observation in The Science of Liberty > by Timothy Ferris > > > Web Page: > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > From d.l.goldsmith at gmail.com Sat May 29 00:16:18 2010 From: d.l.goldsmith at gmail.com (David Goldsmith) Date: Fri, 28 May 2010 21:16:18 -0700 Subject: [Numpy-discussion] Finding Star Images on a Photo (Video chip) Plate? In-Reply-To: References: <4C005591.1060900@sbcglobal.net> <4C008326.3060404@sbcglobal.net> Message-ID: On Fri, May 28, 2010 at 8:31 PM, Anne Archibald wrote: > On 28 May 2010 23:59, Wayne Watson wrote: > > That opened a few avenues. After reading this, I went on a merry search > with > > Google. I hit upon one interesting book, Handbook of CCD astronomy (Steve > B. > > Howell), that discusses PSFs. A Amazon Look Inside suggests this is > mostly > > about h/w. I tried to figure out how to reach the scipy mail list, but, > as > > once a year ago, couldn't figure out the newsgroup GMANE connection. This > > search recalled to mind my Handbook of Astro Image Processing by Berry > and > > Burnell. It has a few pages on the PSF. In the ref section for that > > material(PSFs) there's another ref to Steve Howell that may be of use: > Astro > > CCD Observing and Reduction Techniques, ASP, Pacific Conf. Series, vol. > 23, > > 1992. There are further Berry and Burnell refs that may be applicable. > > Ah, sorry, I've been at an astro conference all week, I should have > expanded that acronym. PSF is short for "Point Spread Function"; the > idea is that with an optically good telescope, a point source anywhere > in the field of view produces a blob of characteristic shape (often > roughly a two-dimensional Gaussian) in your detector. The shape and > size of this blob is set by your optics (including diffraction) and > the atmospheric seeing. A star, being intrinsically a point source, > produces a brighter or less bright version of this blob centered on > the star's true position. 
To accurately measure the star's position > (and often brightness) one usually fits a model blob to the noisy blob > coming from the star of interest. > > I should note that this requires you to have more pixels than you > "need", so that even a point source is spread over many pixels; > without this it's impossible to get subpixel positioning (among other > things). Older consumer digital cameras often lacked this, since it > was difficult to put enough pixels on a CCD, but fortunately megapixel > mania has helpfully ensured that no matter how sharp the focus, every > feature in your image is smeared over many pixels. > > > I probed IRAF, SciPy, and Python, but it looks like a steep learning > curve. > > The SciPy tutorial page looks like overkill. They have what looks like > very > > large tutorials. Perhaps daunting. I did a quick shot at pyraf, a > tutorial > > page, but note it has a prereq of IRAF. Another daunting path. > > Wait, you think SciPy has too many tutorials? Or that they're too > detailed? Just pick a short, easy, or sketchy one then. Here's one > that's all three: > > >>> import scipy.stats > >>> scipy.stats.norm.cdf(3) > 0.9986501019683699 > > That's the value of the CDF of a standard normal at three sigma, i.e., > one minus the false positive probability for a one-sided three sigma > detection. > > > Well, maybe a DIY approach will do the trick for me. > > I haven't used IRAF yet (though I have data sets waiting), and I do > understand the urge to write your own code rather than understanding > someone else's, but let me point out that reliably extracting source > parameters from astronomical images is *hard* and requires cleverness, > attention to countless special cases, troubleshooting, and experience. > But it's an old problem, and astronomers have taken all of the needed > things listed above and built them into IRAF. Do consider using it. > > Anne > Plus, if you're in the field of astronomy, knowing py/IRAF will be a *big* gold star on your resume. :-) DG > On 5/28/2010 5:41 PM, Anne Archibald wrote: > > > > On 28 May 2010 21:09, Charles R Harris > wrote: > > > > > > On Fri, May 28, 2010 at 5:45 PM, Wayne Watson < > sierra_mtnview at sbcglobal.net> > > wrote: > > > > > > Suppose I have a 640x480 pixel video chip and would like to find star > > images on it, possible planets and the moon. A possibility of noise > > exits, or bright pixels. Is there a known method for finding the > > centroids of these astro objects? > > > > > > > > You can threshold the image and then cluster the pixels in objects. I've > > done this on occasion using my own software, but I think there might be > > something in scipy/ndimage that does the same. Someone here will know. > > > > > > There are sort of two passes here - the first is to find all the > > stars, and the second is to fine down their positions, ideally to less > > than a pixel. For the former, thresholding and clumping is probably > > the way to go. > > > > For the latter I think a standard approach is PSF fitting - that is, > > you fit (say) a two-dimensional Gaussian to the pixels near your star. > > You'll fit for at least central (subpixel) position, probably radius, > > and maybe eccentricity and orientation. You might even fit for a more > > sophisticated PSF (doughnuts are natural for Schmidt-Cassegrain > > telescopes, or the diffraction pattern of your spider). 
Any spot whose > > best-fit PSF is just one pixel wide is noise or a cosmic ray hit or a > > hotpixel; any spot whose best-fit PSF is huge is a detector splodge or > > a planet or galaxy. > > > > All this assumes that your CCD has more resolution than your optics; > > if this is not the case you're more or less stuck, since a star is > > then just a bright pixel. In this case your problem is one of > > combining multiple offset images, dark skies, and dome flats to try to > > distinguish detector crud and cosmic ray hits from actual stars. It > > can be done, but it will be a colossal pain if your pointing accuracy > > is not subpixel (which it probably won't be). > > > > In any case, my optical friends tell me that the Right Way to do all > > this is to use all the code built into IRAF (or its python wrapper, > > pyraf) that does all this difficult work for you. > > > > Anne > > P.S. if your images have been fed through JPEG or some other lossy > > compression the process will become an utter nightmare. -A > > > > > > > > Chuck > > > > > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > > > > > > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > > > > > -- > > Wayne Watson (Watson Adventures, Prop., Nevada City, CA) > > > > (121.015 Deg. W, 39.262 Deg. N) GMT-8 hr std. time) > > Obz Site: 39? 15' 7" N, 121? 2' 32" W, 2700 feet > > > > There are no statues or memorials dedicated to > > Thomas Paine for his substantial part in the > > American Revolution. > > > > -- An observation in The Science of Liberty > > by Timothy Ferris > > > > > > Web Page: > > > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -- Mathematician: noun, someone who disavows certainty when their uncertainty set is non-empty, even if that set has measure zero. Hope: noun, that delusive spirit which escaped Pandora's jar and, with her lies, prevents mankind from committing a general suicide. (As interpreted by Robert Graves) -------------- next part -------------- An HTML attachment was scrubbed... URL: From kwgoodman at gmail.com Sat May 29 13:27:19 2010 From: kwgoodman at gmail.com (Keith Goodman) Date: Sat, 29 May 2010 10:27:19 -0700 Subject: [Numpy-discussion] ix_ and copies Message-ID: Will making changes to arr2 never change arr1 if arr2 = arr1[np.ix_(*lists)] where lists is a list of (index) lists? np.ix_ returns a tuple of arrays so I'm guessing (and hoping) the answer is yes. 
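For concreteness, a short session (array values arbitrary) showing the distinction being asked about:

>>> import numpy as np
>>> arr1 = np.arange(12).reshape(3, 4)
>>> arr2 = arr1[np.ix_([0, 2], [1, 3])]   # fancy indexing, so arr2 is a copy
>>> arr2[:] = -1
>>> arr1[0, 1], arr1[2, 3]                # arr1 is untouched
(1, 11)
>>> view = arr1[::2, 1:]                  # plain slicing gives a view instead
>>> view[0, 0] = 99
>>> arr1[0, 1]                            # arr1 changes through the view
99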
From josef.pktd at gmail.com Sat May 29 13:56:04 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Sat, 29 May 2010 13:56:04 -0400 Subject: [Numpy-discussion] assert_almost_equal and inf Message-ID: I'm getting some test failures in my code with infs in numpy 1.4, nan treatment has been corrected in numpy.testing >>> assert_almost_equal(np.array([-5.23722324, -np.nan, -5.23722324]), np.array([-5.23722324, -np.nan, -5.23722324])) but it seems that inf doesn't compare almost equal >>> assert_almost_equal(np.array([-5.23722324, -np.inf, -5.23722324]), np.array([-5.23722324, -np.inf, -5.23722324])) Traceback (most recent call last): ... ValueError: Arrays are not almost equal x: array([-5.23722324, -Inf, -5.23722324]) y: array([-5.23722324, -Inf, -5.23722324]) numpy 1.4.0 has this been changed already? or is it not desired? Josef From robert.kern at gmail.com Sat May 29 14:09:56 2010 From: robert.kern at gmail.com (Robert Kern) Date: Sat, 29 May 2010 13:09:56 -0500 Subject: [Numpy-discussion] ix_ and copies In-Reply-To: References: Message-ID: On Sat, May 29, 2010 at 12:27, Keith Goodman wrote: > Will making changes to arr2 never change arr1 if > > arr2 = arr1[np.ix_(*lists)] > > where lists is a list of (index) lists? np.ix_ returns a tuple of > arrays so I'm guessing (and hoping) the answer is yes. Correct. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From sierra_mtnview at sbcglobal.net Sat May 29 14:29:56 2010 From: sierra_mtnview at sbcglobal.net (Wayne Watson) Date: Sat, 29 May 2010 11:29:56 -0700 Subject: [Numpy-discussion] Finding Star Images on a Photo (Video chip) Plate? In-Reply-To: References: <4C005591.1060900@sbcglobal.net> <4C008326.3060404@sbcglobal.net> Message-ID: <4C015D24.30905@sbcglobal.net> I managed to find some FORTRAN code that is relevant. Algorithms for CCD Stellar Photometry,K. Mighell, Kitt Peak, . He discusses PSF and provides a few "starter" programs, centroid and peak determination. He discusses sky background (, correction, cleanup?) and points to his photometry package). My chip is not big at all, about 1/2". No large chips for what I'm doing. I don't mind starting with simple methods to see how far they get me. That's a learning experience that often provides a better understanding of the whole process. At this point, this is about exploration to see how setting up an image to analyze can be accomplished. Once done, then well known methods are available for my purpose. Where this is going at the moment is to do source extraction. Towards that end I found an interesting source that does a good job of summarizing how one does it. Here's an excerpt. ===================================== Star-Extraction Algorithms The conventional star-extraction algorithms[4,5] widely used for ground-based data reductions are comprised of two basic steps. The first step is to fit the sky background in order to detect the star pixels from the background. The second is to search for pixels that belong to the same star, and group them together to compute the center coordinates of the star. A typical implementation of such an algorithm is described as below. 1) Background Estimation First, the sky background in each pixel is estimated in order to seperate the object signal from the background signal[4]. The image is divided into many grid cells, each including a number of pixels. 
In each grid cell, an estimate of the background level is made and is assigned to the center pixel as its nominal value. A relatively "coarse" mesh of the sky background distribution is thus constructed. The value at any other pixel can be obtained from 2-dimensional spline-fitting interpolation of the mesh. 2) Searching for Star Pixels After the interpolated sky background is subtracted from the image, the sky contribution to any star pixel becomes negligible. Only small residues remain in the sky pixels and the mean residue value can be taken as the standard deviation of the background. After convolution with a Gaussian filter, the image is traversed to search for pixels whose values are N times higher than the background standard deviation. Each group of connected high-value pixels are joined together to form an extracted star. The conventional algorithms described above are time-consuming, although they have the advantage of detecting sources with very low signal-to-noise ratios (SNRs). They are not suitable for satellite onboard data reductions since the onboard computers are usually several hundred times slower than even a normal personal computer available on the ground. SciPy. Hmm, this looks reasonable. . Somehow I missed it and instead landed on , which looks fairly imposing. Those are tar and gzip files. In any case, the whole page looks quite imposing. Where did you find the simple one below? =============================== On 5/28/2010 8:31 PM, Anne Archibald wrote: > On 28 May 2010 23:59, Wayne Watson wrote: > >> That opened a few avenues. After reading this, I went on a merry search with >> Google. I hit upon one interesting book, Handbook of CCD astronomy (Steve B. >> Howell), that discusses PSFs. A Amazon Look Inside suggests this is mostly >> about h/w. I tried to figure out how to reach the scipy mail list, but, as >> once a year ago, couldn't figure out the newsgroup GMANE connection. This >> search recalled to mind my Handbook of Astro Image Processing by Berry and >> Burnell. It has a few pages on the PSF. In the ref section for that >> material(PSFs) there's another ref to Steve Howell that may be of use: Astro >> CCD Observing and Reduction Techniques, ASP, Pacific Conf. Series, vol. 23, >> 1992. There are further Berry and Burnell refs that may be applicable. >> > Ah, sorry, I've been at an astro conference all week, I should have > expanded that acronym. PSF is short for "Point Spread Function"; the > idea is that with an optically good telescope, a point source anywhere > in the field of view produces a blob of characteristic shape (often > roughly a two-dimensional Gaussian) in your detector. The shape and > size of this blob is set by your optics (including diffraction) and > the atmospheric seeing. A star, being intrinsically a point source, > produces a brighter or less bright version of this blob centered on > the star's true position. To accurately measure the star's position > (and often brightness) one usually fits a model blob to the noisy blob > coming from the star of interest. > > I should note that this requires you to have more pixels than you > "need", so that even a point source is spread over many pixels; > without this it's impossible to get subpixel positioning (among other > things). Older consumer digital cameras often lacked this, since it > was difficult to put enough pixels on a CCD, but fortunately megapixel > mania has helpfully ensured that no matter how sharp the focus, every > feature in your image is smeared over many pixels. 
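For what it's worth, the recipe excerpted above maps fairly directly onto scipy.ndimage. In the sketch below a wide median filter stands in for the grid-and-spline background mesh, and N, the filter sizes and the MAD-based noise estimate are illustrative choices rather than anything taken from the quoted text:

import numpy as np
from scipy import ndimage

def extract_stars(img, N=4.0, bg_size=64, smooth_sigma=1.5):
    # Estimate the sky, subtract it, smooth, cut at N sigma, group connected pixels.
    img = img.astype(float)
    sky = ndimage.median_filter(img, size=bg_size)      # coarse background model
    resid = img - sky
    sigma = 1.4826 * np.median(np.abs(resid - np.median(resid)))   # robust noise level
    mask = ndimage.gaussian_filter(resid, smooth_sigma) > N * sigma
    labels, nobj = ndimage.label(mask)                  # each connected clump is one star
    return ndimage.center_of_mass(resid, labels, range(1, nobj + 1))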
> > >> I probed IRAF, SciPy, and Python, but it looks like a steep learning curve. >> The SciPy tutorial page looks like overkill. They have what looks like very >> large tutorials. Perhaps daunting. I did a quick shot at pyraf, a tutorial >> page, but note it has a prereq of IRAF. Another daunting path. >> > Wait, you think SciPy has too many tutorials? Or that they're too > detailed? Just pick a short, easy, or sketchy one then. Here's one > that's all three: > > >>>> import scipy.stats >>>> scipy.stats.norm.cdf(3) >>>> > 0.9986501019683699 > > That's the value of the CDF of a standard normal at three sigma, i.e., > one minus the false positive probability for a one-sided three sigma > detection. > > >> Well, maybe a DIY approach will do the trick for me. >> > I haven't used IRAF yet (though I have data sets waiting), and I do > understand the urge to write your own code rather than understanding > someone else's, but let me point out that reliably extracting source > parameters from astronomical images is *hard* and requires cleverness, > attention to countless special cases, troubleshooting, and experience. > But it's an old problem, and astronomers have taken all of the needed > things listed above and built them into IRAF. Do consider using it. > > Anne > > >> On 5/28/2010 5:41 PM, Anne Archibald wrote: >> >> On 28 May 2010 21:09, Charles R Harris wrote: >> >> >> On Fri, May 28, 2010 at 5:45 PM, Wayne Watson >> wrote: >> >> >> Suppose I have a 640x480 pixel video chip and would like to find star >> images on it, possible planets and the moon. A possibility of noise >> exits, or bright pixels. Is there a known method for finding the >> centroids of these astro objects? >> >> >> >> You can threshold the image and then cluster the pixels in objects. I've >> done this on occasion using my own software, but I think there might be >> something in scipy/ndimage that does the same. Someone here will know. >> >> >> There are sort of two passes here - the first is to find all the >> stars, and the second is to fine down their positions, ideally to less >> than a pixel. For the former, thresholding and clumping is probably >> the way to go. >> >> For the latter I think a standard approach is PSF fitting - that is, >> you fit (say) a two-dimensional Gaussian to the pixels near your star. >> You'll fit for at least central (subpixel) position, probably radius, >> and maybe eccentricity and orientation. You might even fit for a more >> sophisticated PSF (doughnuts are natural for Schmidt-Cassegrain >> telescopes, or the diffraction pattern of your spider). Any spot whose >> best-fit PSF is just one pixel wide is noise or a cosmic ray hit or a >> hotpixel; any spot whose best-fit PSF is huge is a detector splodge or >> a planet or galaxy. >> >> All this assumes that your CCD has more resolution than your optics; >> if this is not the case you're more or less stuck, since a star is >> then just a bright pixel. In this case your problem is one of >> combining multiple offset images, dark skies, and dome flats to try to >> distinguish detector crud and cosmic ray hits from actual stars. It >> can be done, but it will be a colossal pain if your pointing accuracy >> is not subpixel (which it probably won't be). >> >> In any case, my optical friends tell me that the Right Way to do all >> this is to use all the code built into IRAF (or its python wrapper, >> pyraf) that does all this difficult work for you. >> >> Anne >> P.S. 
if your images have been fed through JPEG or some other lossy >> compression the process will become an utter nightmare. -A >> >> >> >> Chuck >> >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> >> >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> >> >> -- >> Wayne Watson (Watson Adventures, Prop., Nevada City, CA) >> >> (121.015 Deg. W, 39.262 Deg. N) GMT-8 hr std. time) >> Obz Site: 39? 15' 7" N, 121? 2' 32" W, 2700 feet >> >> There are no statues or memorials dedicated to >> Thomas Paine for his substantial part in the >> American Revolution. >> >> -- An observation in The Science of Liberty >> by Timothy Ferris >> >> >> Web Page: >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> >> > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -- Wayne Watson (Watson Adventures, Prop., Nevada City, CA) (121.015 Deg. W, 39.262 Deg. N) GMT-8 hr std. time) Obz Site: 39? 15' 7" N, 121? 2' 32" W, 2700 feet There are no statues or memorials dedicated to Thomas Paine for his substantial part in the American Revolution. -- An observation in The Science of Liberty by Timothy Ferris Web Page: From kwgoodman at gmail.com Sat May 29 14:35:13 2010 From: kwgoodman at gmail.com (Keith Goodman) Date: Sat, 29 May 2010 11:35:13 -0700 Subject: [Numpy-discussion] shuffle a slice Message-ID: np.random.shuffle: "Modify a sequence in-place by shuffling its contents." Matches doc string: >> a = np.arange(10) >> np.random.shuffle(a[:-1]) >> a array([0, 7, 8, 4, 3, 6, 2, 1, 5, 9]) Doesn't match doc string: >> l = range(10) >> np.random.shuffle(l[:-1]) >> l [0, 1, 2, 3, 4, 5, 6, 7, 8, 9] Is there any way for numpy to catch this? From sierra_mtnview at sbcglobal.net Sat May 29 14:44:28 2010 From: sierra_mtnview at sbcglobal.net (Wayne Watson) Date: Sat, 29 May 2010 11:44:28 -0700 Subject: [Numpy-discussion] Finding Star Images on a Photo (Video chip) Plate? In-Reply-To: References: <4C005591.1060900@sbcglobal.net> <4C008326.3060404@sbcglobal.net> Message-ID: <4C01608C.7000003@sbcglobal.net> An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Sat May 29 14:45:35 2010 From: robert.kern at gmail.com (Robert Kern) Date: Sat, 29 May 2010 13:45:35 -0500 Subject: [Numpy-discussion] shuffle a slice In-Reply-To: References: Message-ID: On Sat, May 29, 2010 at 13:35, Keith Goodman wrote: > np.random.shuffle: "Modify a sequence in-place by shuffling its contents." > > Matches doc string: > >>> a = np.arange(10) >>> np.random.shuffle(a[:-1]) >>> a > ? array([0, 7, 8, 4, 3, 6, 2, 1, 5, 9]) > > Doesn't match doc string: > >>> l = range(10) >>> np.random.shuffle(l[:-1]) >>> l > ? [0, 1, 2, 3, 4, 5, 6, 7, 8, 9] This behavior does match the doc-string. l[:-1] creates a new list unconnected to the original list. np.random.shuffle() then shuffles that new list in-place. > Is there any way for numpy to catch this? Nope. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." 
-- Umberto Eco From kwgoodman at gmail.com Sat May 29 15:03:44 2010 From: kwgoodman at gmail.com (Keith Goodman) Date: Sat, 29 May 2010 12:03:44 -0700 Subject: [Numpy-discussion] shuffle a slice In-Reply-To: References: Message-ID: On Sat, May 29, 2010 at 11:45 AM, Robert Kern wrote: > On Sat, May 29, 2010 at 13:35, Keith Goodman wrote: >> np.random.shuffle: "Modify a sequence in-place by shuffling its contents." >> >> Matches doc string: >> >>>> a = np.arange(10) >>>> np.random.shuffle(a[:-1]) >>>> a >> ? array([0, 7, 8, 4, 3, 6, 2, 1, 5, 9]) >> >> Doesn't match doc string: >> >>>> l = range(10) >>>> np.random.shuffle(l[:-1]) >>>> l >> ? [0, 1, 2, 3, 4, 5, 6, 7, 8, 9] > > This behavior does match the doc-string. l[:-1] creates a new list > unconnected to the original list. np.random.shuffle() then shuffles > that new list in-place. > >> Is there any way for numpy to catch this? > > Nope. The best way to remember something is to turn it into a dumb question and then post to a large mailing list. Make sure not to use an alias. From pav at iki.fi Sat May 29 15:09:49 2010 From: pav at iki.fi (Pauli Virtanen) Date: Sat, 29 May 2010 19:09:49 +0000 (UTC) Subject: [Numpy-discussion] Finding Star Images on a Photo (Video chip) Plate? References: <4C005591.1060900@sbcglobal.net> <4C008326.3060404@sbcglobal.net> <4C015D24.30905@sbcglobal.net> Message-ID: Sat, 29 May 2010 11:29:56 -0700, Wayne Watson wrote: [clip] > SciPy. Hmm, this looks reasonable. . > Somehow I missed it and instead landed on > , which looks > fairly imposing. Those are tar and gzip files. There's also the PDF file that contains the main text. > In any case, the whole page looks quite imposing. Where did you > find the simple one below? The scipy.org wiki is organically grown and confusing at times (I moved the page you landed on to a more clear place). The correct places where this stuff should be is http://docs.scipy.org/ http://scipy.org/Additional_Documentation -- Pauli Virtanen From cohen at lpta.in2p3.fr Sat May 29 17:04:57 2010 From: cohen at lpta.in2p3.fr (Johann Cohen-Tanugi) Date: Sat, 29 May 2010 23:04:57 +0200 Subject: [Numpy-discussion] Finding Star Images on a Photo (Video chip) Plate? In-Reply-To: <4C015D24.30905@sbcglobal.net> References: <4C005591.1060900@sbcglobal.net> <4C008326.3060404@sbcglobal.net> <4C015D24.30905@sbcglobal.net> Message-ID: <4C018179.5030508@lpta.in2p3.fr> I have used sextractor in the past : http://terapix.iap.fr/rubrique.php?id_rubrique=91/ Johann On 05/29/2010 08:29 PM, Wayne Watson wrote: > I managed to find some FORTRAN code that is relevant. Algorithms for CCD > Stellar Photometry,K. Mighell, Kitt Peak, > . He > discusses PSF and provides a few "starter" programs, centroid and peak > determination. He discusses sky background (, correction, cleanup?) and > points to his photometry package). > > My chip is not big at all, about 1/2". No large chips for what I'm > doing. I don't mind starting with simple methods to see how far they > get me. That's a learning experience that often provides a better > understanding of the whole process. At this point, this is about > exploration to see how setting up an image to analyze can be > accomplished. Once done, then well known methods are available for my > purpose. > > Where this is going at the moment is to do source extraction. Towards > that end I found an interesting source that does a good job of > summarizing how one does it. Here's an excerpt. 
> ===================================== > Star-Extraction Algorithms > The conventional star-extraction algorithms[4,5] widely used for > ground-based data reductions are comprised of two basic steps. The first > step is to fit the sky background in order to detect the star pixels > from the background. The second is to search for pixels that belong to > the same star, and group them together to compute the center coordinates > of the star. A typical implementation of such an algorithm is described > as below. > 1) > Background Estimation > First, the sky background in each pixel is estimated in order to > seperate the object signal from the background signal[4]. The image is > divided into many grid cells, each including a number of pixels. In each > grid cell, an estimate of the background level is made and is assigned > to the center pixel as its nominal value. A relatively "coarse" mesh of > the sky background distribution is thus constructed. The value at any > other pixel can be obtained from 2-dimensional spline-fitting > interpolation of the mesh. > 2) > Searching for Star Pixels > After the interpolated sky background is subtracted from the image, the > sky contribution to any star pixel becomes negligible. Only small > residues remain in the sky pixels and the mean residue value can be > taken as the standard deviation of the background. After convolution > with a Gaussian filter, the image is traversed to search for pixels > whose values are N times higher than the background standard deviation. > Each group of connected high-value pixels are joined together to form an > extracted star. > The conventional algorithms described above are time-consuming, although > they have the advantage of detecting sources with very low > signal-to-noise ratios (SNRs). They are not suitable for satellite > onboard data reductions since the onboard computers are usually several > hundred times slower than even a normal personal computer available on > the ground. > > SciPy. Hmm, this looks reasonable.. > Somehow I missed it and instead landed on > , which looks > fairly imposing. Those are tar and gzip files. In any case, the whole > page looks quite imposing. Where did you find the simple one below? > =============================== > > > On 5/28/2010 8:31 PM, Anne Archibald wrote: > >> On 28 May 2010 23:59, Wayne Watson wrote: >> >> >>> That opened a few avenues. After reading this, I went on a merry search with >>> Google. I hit upon one interesting book, Handbook of CCD astronomy (Steve B. >>> Howell), that discusses PSFs. A Amazon Look Inside suggests this is mostly >>> about h/w. I tried to figure out how to reach the scipy mail list, but, as >>> once a year ago, couldn't figure out the newsgroup GMANE connection. This >>> search recalled to mind my Handbook of Astro Image Processing by Berry and >>> Burnell. It has a few pages on the PSF. In the ref section for that >>> material(PSFs) there's another ref to Steve Howell that may be of use: Astro >>> CCD Observing and Reduction Techniques, ASP, Pacific Conf. Series, vol. 23, >>> 1992. There are further Berry and Burnell refs that may be applicable. >>> >>> >> Ah, sorry, I've been at an astro conference all week, I should have >> expanded that acronym. PSF is short for "Point Spread Function"; the >> idea is that with an optically good telescope, a point source anywhere >> in the field of view produces a blob of characteristic shape (often >> roughly a two-dimensional Gaussian) in your detector. 
The shape and >> size of this blob is set by your optics (including diffraction) and >> the atmospheric seeing. A star, being intrinsically a point source, >> produces a brighter or less bright version of this blob centered on >> the star's true position. To accurately measure the star's position >> (and often brightness) one usually fits a model blob to the noisy blob >> coming from the star of interest. >> >> I should note that this requires you to have more pixels than you >> "need", so that even a point source is spread over many pixels; >> without this it's impossible to get subpixel positioning (among other >> things). Older consumer digital cameras often lacked this, since it >> was difficult to put enough pixels on a CCD, but fortunately megapixel >> mania has helpfully ensured that no matter how sharp the focus, every >> feature in your image is smeared over many pixels. >> >> >> >>> I probed IRAF, SciPy, and Python, but it looks like a steep learning curve. >>> The SciPy tutorial page looks like overkill. They have what looks like very >>> large tutorials. Perhaps daunting. I did a quick shot at pyraf, a tutorial >>> page, but note it has a prereq of IRAF. Another daunting path. >>> >>> >> Wait, you think SciPy has too many tutorials? Or that they're too >> detailed? Just pick a short, easy, or sketchy one then. Here's one >> that's all three: >> >> >> >>>>> import scipy.stats >>>>> scipy.stats.norm.cdf(3) >>>>> >>>>> >> 0.9986501019683699 >> >> That's the value of the CDF of a standard normal at three sigma, i.e., >> one minus the false positive probability for a one-sided three sigma >> detection. >> >> >> >>> Well, maybe a DIY approach will do the trick for me. >>> >>> >> I haven't used IRAF yet (though I have data sets waiting), and I do >> understand the urge to write your own code rather than understanding >> someone else's, but let me point out that reliably extracting source >> parameters from astronomical images is *hard* and requires cleverness, >> attention to countless special cases, troubleshooting, and experience. >> But it's an old problem, and astronomers have taken all of the needed >> things listed above and built them into IRAF. Do consider using it. >> >> Anne >> >> >> >>> On 5/28/2010 5:41 PM, Anne Archibald wrote: >>> >>> On 28 May 2010 21:09, Charles R Harris wrote: >>> >>> >>> On Fri, May 28, 2010 at 5:45 PM, Wayne Watson >>> wrote: >>> >>> >>> Suppose I have a 640x480 pixel video chip and would like to find star >>> images on it, possible planets and the moon. A possibility of noise >>> exits, or bright pixels. Is there a known method for finding the >>> centroids of these astro objects? >>> >>> >>> >>> You can threshold the image and then cluster the pixels in objects. I've >>> done this on occasion using my own software, but I think there might be >>> something in scipy/ndimage that does the same. Someone here will know. >>> >>> >>> There are sort of two passes here - the first is to find all the >>> stars, and the second is to fine down their positions, ideally to less >>> than a pixel. For the former, thresholding and clumping is probably >>> the way to go. >>> >>> For the latter I think a standard approach is PSF fitting - that is, >>> you fit (say) a two-dimensional Gaussian to the pixels near your star. >>> You'll fit for at least central (subpixel) position, probably radius, >>> and maybe eccentricity and orientation. 
You might even fit for a more >>> sophisticated PSF (doughnuts are natural for Schmidt-Cassegrain >>> telescopes, or the diffraction pattern of your spider). Any spot whose >>> best-fit PSF is just one pixel wide is noise or a cosmic ray hit or a >>> hotpixel; any spot whose best-fit PSF is huge is a detector splodge or >>> a planet or galaxy. >>> >>> All this assumes that your CCD has more resolution than your optics; >>> if this is not the case you're more or less stuck, since a star is >>> then just a bright pixel. In this case your problem is one of >>> combining multiple offset images, dark skies, and dome flats to try to >>> distinguish detector crud and cosmic ray hits from actual stars. It >>> can be done, but it will be a colossal pain if your pointing accuracy >>> is not subpixel (which it probably won't be). >>> >>> In any case, my optical friends tell me that the Right Way to do all >>> this is to use all the code built into IRAF (or its python wrapper, >>> pyraf) that does all this difficult work for you. >>> >>> Anne >>> P.S. if your images have been fed through JPEG or some other lossy >>> compression the process will become an utter nightmare. -A >>> >>> >>> >>> Chuck >>> >>> >>> _______________________________________________ >>> NumPy-Discussion mailing list >>> NumPy-Discussion at scipy.org >>> http://mail.scipy.org/mailman/listinfo/numpy-discussion >>> >>> >>> >>> >>> _______________________________________________ >>> NumPy-Discussion mailing list >>> NumPy-Discussion at scipy.org >>> http://mail.scipy.org/mailman/listinfo/numpy-discussion >>> >>> >>> >>> -- >>> Wayne Watson (Watson Adventures, Prop., Nevada City, CA) >>> >>> (121.015 Deg. W, 39.262 Deg. N) GMT-8 hr std. time) >>> Obz Site: 39? 15' 7" N, 121? 2' 32" W, 2700 feet >>> >>> There are no statues or memorials dedicated to >>> Thomas Paine for his substantial part in the >>> American Revolution. >>> >>> -- An observation in The Science of Liberty >>> by Timothy Ferris >>> >>> >>> Web Page: >>> >>> _______________________________________________ >>> NumPy-Discussion mailing list >>> NumPy-Discussion at scipy.org >>> http://mail.scipy.org/mailman/listinfo/numpy-discussion >>> >>> >>> >>> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> > From aarchiba at physics.mcgill.ca Sat May 29 17:49:11 2010 From: aarchiba at physics.mcgill.ca (Anne Archibald) Date: Sat, 29 May 2010 18:49:11 -0300 Subject: [Numpy-discussion] ix_ and copies In-Reply-To: References: Message-ID: On 29 May 2010 15:09, Robert Kern wrote: > On Sat, May 29, 2010 at 12:27, Keith Goodman wrote: >> Will making changes to arr2 never change arr1 if >> >> arr2 = arr1[np.ix_(*lists)] >> >> where lists is a list of (index) lists? np.ix_ returns a tuple of >> arrays so I'm guessing (and hoping) the answer is yes. > > Correct. To expand: any time you do fancy indexing - that is, index by anything but a tuple of integers or slice objects - you get back a copy. Anne > > -- > Robert Kern > > "I have come to believe that the whole world is an enigma, a harmless > enigma that is made terrible by our own mad attempt to interpret it as > though it had an underlying truth." 
> ?-- Umberto Eco > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From kwgoodman at gmail.com Sun May 30 10:25:25 2010 From: kwgoodman at gmail.com (Keith Goodman) Date: Sun, 30 May 2010 07:25:25 -0700 Subject: [Numpy-discussion] ix_ and copies In-Reply-To: References: Message-ID: On Sat, May 29, 2010 at 2:49 PM, Anne Archibald wrote: > On 29 May 2010 15:09, Robert Kern wrote: >> On Sat, May 29, 2010 at 12:27, Keith Goodman wrote: >>> Will making changes to arr2 never change arr1 if >>> >>> arr2 = arr1[np.ix_(*lists)] >>> >>> where lists is a list of (index) lists? np.ix_ returns a tuple of >>> arrays so I'm guessing (and hoping) the answer is yes. >> >> Correct. > > To expand: any time you do fancy indexing - that is, index by anything > but a tuple of integers or slice objects - you get back a copy. I have never seen such a simple and clear definition of the line between regular and fancy indexing. To make sure I understand I'll try to expand. Is the following right? Regular indexing (no copy made): int float bool slice tuple of any combination of the above Fancy indexing (copy made): Any indexing that is not regular indexing From aarchiba at physics.mcgill.ca Sun May 30 10:49:39 2010 From: aarchiba at physics.mcgill.ca (Anne Archibald) Date: Sun, 30 May 2010 11:49:39 -0300 Subject: [Numpy-discussion] ix_ and copies In-Reply-To: References: Message-ID: On 30 May 2010 11:25, Keith Goodman wrote: > On Sat, May 29, 2010 at 2:49 PM, Anne Archibald > wrote: >> On 29 May 2010 15:09, Robert Kern wrote: >>> On Sat, May 29, 2010 at 12:27, Keith Goodman wrote: >>>> Will making changes to arr2 never change arr1 if >>>> >>>> arr2 = arr1[np.ix_(*lists)] >>>> >>>> where lists is a list of (index) lists? np.ix_ returns a tuple of >>>> arrays so I'm guessing (and hoping) the answer is yes. >>> >>> Correct. >> >> To expand: any time you do fancy indexing - that is, index by anything >> but a tuple of integers or slice objects - you get back a copy. > > I have never seen such a simple and clear definition of the line > between regular and fancy indexing. > > To make sure I understand I'll try to expand. Is the following right? > > Regular indexing (no copy made): > int > float > bool > slice > tuple of any combination of the above I think you should not include bool on this list. Strictly speaking I believe you can use bools as if they were integers, but that's a little limited. Normally when one indexes with bools one is indexing with an array of bools, as a sort of condition index; that is fancy indexing. The underlying reason for the copy/no copy distinction is that numpy arrays must be evenly strided, that is, as you move along any axis, the space between data elements must not vary. So slicing is no problem, and supplying an integer is no problem. Supplying a float is kind of bogus but might work anyway. Supplying None or np.newaxis also works, since this just adds an axis of length one. > > Fancy indexing (copy made): > Any indexing that is not regular indexing The only two options here are (essentially) indexing with (tuples of) arrays of indices or indexing with boolean (condition) arrays. "Mixed" modes, where you supply a tuple containing some arrays of indices and/or some booleans but also some slice objects or integers, may work but may do something unexpected or may simply fail to work. 
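To put numbers on the even-stride point, here is an illustrative session (it assumes 8-byte integers, as on the 64-bit Linux boxes mentioned elsewhere in this thread):

>>> import numpy as np
>>> a = np.arange(10)
>>> a.strides                 # one fixed byte step along the axis
(8,)
>>> a[::3].strides            # a slice only changes the step, so a view is possible
(24,)
>>> b = a[[0, 1, 5]]          # an index array has no single stride, so numpy copies
>>> b[:] = -1
>>> a                         # the original is unchanged
array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
>>> c = a[a > 6]              # a boolean (condition) index is also a copy
>>> c[:] = 0
>>> a
array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])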
There was last time I looked no systematic testing of such constructions, and the implementation was erratic. (This is largely a definitional issue; given the way numpy's arrays of indices and boolean indexing work it's not clear how one should interpret such a mixed indexing operation.) There have been occasional calls for a pure-python implementation of numpy indexing for reference purposes. I think such a thing would be fun to write, but I haven't had time. Anne > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From ralf.gommers at googlemail.com Sun May 30 11:52:53 2010 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Sun, 30 May 2010 23:52:53 +0800 Subject: [Numpy-discussion] Introduction to Scott, Jason, and (possibly) others from Enthought In-Reply-To: References: <2D706A49-3574-4E5D-81CD-B89B5E6BD0B5@enthought.com> <11E425D5-D34A-44FC-A0B1-6D2641D522EA@enthought.com> Message-ID: On Sat, May 29, 2010 at 9:05 AM, Pauli Virtanen wrote: > Fri, 28 May 2010 12:13:57 -0500, Travis Oliphant wrote: > [clip] > > I think this makes sense so that we can reschedule NumPy 2.0 for > > September and still provide a release with the Python 3k changes (I am > > assuming these can be done in an ABI-compatible way). > > As far as I remember, the Py3 changes did not require breaking ABI (on > Py2). > > Good. How much work is it to remove datetime, and who wants to do it? Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From sierra_mtnview at sbcglobal.net Sun May 30 13:07:19 2010 From: sierra_mtnview at sbcglobal.net (Wayne Watson) Date: Sun, 30 May 2010 10:07:19 -0700 Subject: [Numpy-discussion] Finding Star Images on a Photo (Video chip) Plate? In-Reply-To: References: <4C005591.1060900@sbcglobal.net> <4C008326.3060404@sbcglobal.net> <4C015D24.30905@sbcglobal.net> Message-ID: <4C029B47.2040005@sbcglobal.net> Good. I'll latch onto them (bookmark) them soon. I like the organic description. On 5/29/2010 12:09 PM, Pauli Virtanen wrote: > Sat, 29 May 2010 11:29:56 -0700, Wayne Watson wrote: > [clip] > >> SciPy. Hmm, this looks reasonable.. >> Somehow I missed it and instead landed on >> , which looks >> fairly imposing. Those are tar and gzip files. >> > There's also the PDF file that contains the main text. > > >> In any case, the whole page looks quite imposing. Where did you >> find the simple one below? >> > The scipy.org wiki is organically grown and confusing at times (I moved > the page you landed on to a more clear place). > > The correct places where this stuff should be is > > http://docs.scipy.org/ > > http://scipy.org/Additional_Documentation > > -- Wayne Watson (Watson Adventures, Prop., Nevada City, CA) (121.015 Deg. W, 39.262 Deg. N) GMT-8 hr std. time) Obz Site: 39? 15' 7" N, 121? 2' 32" W, 2700 feet "Science and democracy are based on the rejection "of dogma." 
-- Dick Taverne, The March of Unreason Web Page: From charlesr.harris at gmail.com Sun May 30 14:06:58 2010 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sun, 30 May 2010 12:06:58 -0600 Subject: [Numpy-discussion] Introduction to Scott, Jason, and (possibly) others from Enthought In-Reply-To: References: <2D706A49-3574-4E5D-81CD-B89B5E6BD0B5@enthought.com> <11E425D5-D34A-44FC-A0B1-6D2641D522EA@enthought.com> Message-ID: On Sun, May 30, 2010 at 9:52 AM, Ralf Gommers wrote: > > > On Sat, May 29, 2010 at 9:05 AM, Pauli Virtanen wrote: > >> Fri, 28 May 2010 12:13:57 -0500, Travis Oliphant wrote: >> [clip] >> > I think this makes sense so that we can reschedule NumPy 2.0 for >> > September and still provide a release with the Python 3k changes (I am >> > assuming these can be done in an ABI-compatible way). >> >> As far as I remember, the Py3 changes did not require breaking ABI (on >> Py2). >> >> Good. How much work is it to remove datetime, and who wants to do it? > > Hey, I thought that was your job ;) Maybe the best thing is to start by making a branch for the removal and we can argue about who gets the short end of the stick later... Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From sterini at gmail.com Sun May 30 15:32:09 2010 From: sterini at gmail.com (Ilya Sterin) Date: Sun, 30 May 2010 15:32:09 -0400 Subject: [Numpy-discussion] linalg.det throws IndexError on 2.6.5 Message-ID: I'm not sure what's causing this at this point and before I dig deeper thought someone can shed some light... On my Mac OS X python 2.6.1, numpy.linalg.det functions properly... >>> numpy.linalg.det(numpy.array([[1, 2], [3, 4]])) -2.0 On Centos 5 python 2.6.5, I get and IndexError (out of bounds)... >>> numpy.linalg.det(numpy.array([[1, 2], [3, 4]])) Traceback (most recent call last): File "", line 1, in File "/opt/python2.6/lib/python2.6/site-packages/numpy/linalg/linalg.py", line 1507, in det return (1.-2.*sign)*multiply.reduce(diagonal(a), axis=-1) File "/opt/python2.6/lib/python2.6/site-packages/numpy/core/fromnumeric.py", line 949, in diagonal return asarray(a).diagonal(offset, axis1, axis2) IndexError: index 12884901888 out of bounds 0<=index<4 Any ideas? Thanks. Ilya From charlesr.harris at gmail.com Sun May 30 16:02:28 2010 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sun, 30 May 2010 14:02:28 -0600 Subject: [Numpy-discussion] linalg.det throws IndexError on 2.6.5 In-Reply-To: References: Message-ID: On Sun, May 30, 2010 at 1:32 PM, Ilya Sterin wrote: > I'm not sure what's causing this at this point and before I dig deeper > thought someone can shed some light... > > On my Mac OS X python 2.6.1, numpy.linalg.det functions properly... > > >>> numpy.linalg.det(numpy.array([[1, 2], [3, 4]])) > -2.0 > > On Centos 5 python 2.6.5, I get and IndexError (out of bounds)... > > >>> numpy.linalg.det(numpy.array([[1, 2], [3, 4]])) > Traceback (most recent call last): > File "", line 1, in > File "/opt/python2.6/lib/python2.6/site-packages/numpy/linalg/linalg.py", > line 1507, in det > return (1.-2.*sign)*multiply.reduce(diagonal(a), axis=-1) > File > "/opt/python2.6/lib/python2.6/site-packages/numpy/core/fromnumeric.py", > line 949, in diagonal > return asarray(a).diagonal(offset, axis1, axis2) > IndexError: index 12884901888 out of bounds 0<=index<4 > > Any ideas? > > What numpy version? Also, who built the numpy/python? I rather suspect a type problem somewhere. Chuck -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From sterini at gmail.com Sun May 30 16:22:06 2010 From: sterini at gmail.com (Ilya Sterin) Date: Sun, 30 May 2010 16:22:06 -0400 Subject: [Numpy-discussion] linalg.det throws IndexError on 2.6.5 In-Reply-To: References: Message-ID: Numpy 1.4.1. I built it myself. The version of numpy in question on CentOS was built against a 2.6.5 version of python 64 bit binary and built/installed with pip. The version which is working on my OS X is a universal binary and both python and numpy are built as such. I'm running python in 64bit mode though and all works fine there. I just tried to install numpy as per yum instructions on numpy site with the default Centos python 2.4. It installed numpy 1.2 and the determinant function works there. I need to get this to work with 2.6 though, as my app relies on it. Any ideas? Thanks. Ilya On Sun, May 30, 2010 at 4:02 PM, Charles R Harris wrote: > > > On Sun, May 30, 2010 at 1:32 PM, Ilya Sterin wrote: >> >> I'm not sure what's causing this at this point and before I dig deeper >> thought someone can shed some light... >> >> On my Mac OS X python 2.6.1, numpy.linalg.det functions properly... >> >> >>> numpy.linalg.det(numpy.array([[1, 2], [3, 4]])) >> -2.0 >> >> On Centos 5 python 2.6.5, I get and IndexError (out of bounds)... >> >> >>> numpy.linalg.det(numpy.array([[1, 2], [3, 4]])) >> Traceback (most recent call last): >> ?File "", line 1, in >> ?File "/opt/python2.6/lib/python2.6/site-packages/numpy/linalg/linalg.py", >> line 1507, in det >> ? ?return (1.-2.*sign)*multiply.reduce(diagonal(a), axis=-1) >> ?File >> "/opt/python2.6/lib/python2.6/site-packages/numpy/core/fromnumeric.py", >> line 949, in diagonal >> ? ?return asarray(a).diagonal(offset, axis1, axis2) >> IndexError: index 12884901888 out of bounds 0<=index<4 >> >> Any ideas? >> > > What numpy version? Also, who built the numpy/python? I rather suspect a > type problem somewhere. > > Chuck > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > From charlesr.harris at gmail.com Sun May 30 17:15:13 2010 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sun, 30 May 2010 15:15:13 -0600 Subject: [Numpy-discussion] linalg.det throws IndexError on 2.6.5 In-Reply-To: References: Message-ID: On Sun, May 30, 2010 at 2:22 PM, Ilya Sterin wrote: > Numpy 1.4.1. I built it myself. > > The version of numpy in question on CentOS was built against a 2.6.5 > version of python 64 bit binary and built/installed with pip. > > The version which is working on my OS X is a universal binary and both > python and numpy are built as such. I'm running python in 64bit mode > though and all works fine there. > > I just tried to install numpy as per yum instructions on numpy site > with the default Centos python 2.4. It installed numpy 1.2 and the > determinant function works there. > > I need to get this to work with 2.6 though, as my app relies on it. Any > ideas? > > Thanks. > > It works fine for me on 2.6, which is why I asked for more details. $[charris at ubuntu ~]$ python Python 2.6.5 (r265:79063, Apr 16 2010, 13:57:41) [GCC 4.4.3] on linux2 Type "help", "copyright", "credits" or "license" for more information. >>> import numpy >>> numpy.linalg.det(numpy.array([[1, 2], [3, 4]])) -2.0 >>> numpy.__version__ '1.4.1' Did you remove the previous installation of numpy, if any? And why not do the usual "sudo python setup.py install" thingie instead of pip? 
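As a quick, generic sanity check (not something from the thread) on which build actually gets imported:

import numpy
print(numpy.__version__)   # should be the 1.4.1 you just built
print(numpy.__file__)      # should point into the site-packages you installed into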
Chuck
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From ralf.gommers at googlemail.com Sun May 30 19:53:03 2010
From: ralf.gommers at googlemail.com (Ralf Gommers)
Date: Mon, 31 May 2010 07:53:03 +0800
Subject: [Numpy-discussion] Introduction to Scott, Jason, and (possibly) others from Enthought
In-Reply-To:
References: <2D706A49-3574-4E5D-81CD-B89B5E6BD0B5@enthought.com> <11E425D5-D34A-44FC-A0B1-6D2641D522EA@enthought.com>
Message-ID:

On Mon, May 31, 2010 at 2:06 AM, Charles R Harris wrote:
>
>
> On Sun, May 30, 2010 at 9:52 AM, Ralf Gommers
> wrote:
>
>>
>>
>> On Sat, May 29, 2010 at 9:05 AM, Pauli Virtanen wrote:
>>
>>> Fri, 28 May 2010 12:13:57 -0500, Travis Oliphant wrote:
>>> [clip]
>>> > I think this makes sense so that we can reschedule NumPy 2.0 for
>>> > September and still provide a release with the Python 3k changes (I am
>>> > assuming these can be done in an ABI-compatible way).
>>>
>>> As far as I remember, the Py3 changes did not require breaking ABI (on
>>> Py2).
>>>
>>> Good. How much work is it to remove datetime, and who wants to do it?
>>
>>
> Hey, I thought that was your job ;) Maybe the best thing is to start by
> making a branch for the removal and we can argue about who gets the short
> end of the stick later...
>
> Manage the release yes, do the heavy lifting not necessarily:) OK, I'll
make the branch tonight.

If the removal just involves the same changes that were made for 1.4.1
then I can do it. But I'm not familiar with this code and if it's more
work, I probably won't have time for it between the scipy 0.8.0 release
and my 'real job', like you call it.

Cheers,
Ralf
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From charlesr.harris at gmail.com Sun May 30 20:23:55 2010
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Sun, 30 May 2010 18:23:55 -0600
Subject: [Numpy-discussion] Introduction to Scott, Jason, and (possibly) others from Enthought
In-Reply-To:
References: <2D706A49-3574-4E5D-81CD-B89B5E6BD0B5@enthought.com> <11E425D5-D34A-44FC-A0B1-6D2641D522EA@enthought.com>
Message-ID:

On Sun, May 30, 2010 at 5:53 PM, Ralf Gommers wrote:
>
>
> On Mon, May 31, 2010 at 2:06 AM, Charles R Harris <
> charlesr.harris at gmail.com> wrote:
>
>>
>>
>> On Sun, May 30, 2010 at 9:52 AM, Ralf Gommers <
>> ralf.gommers at googlemail.com> wrote:
>>
>>>
>>>
>>> On Sat, May 29, 2010 at 9:05 AM, Pauli Virtanen wrote:
>>>
>>>> Fri, 28 May 2010 12:13:57 -0500, Travis Oliphant wrote:
>>>> [clip]
>>>> > I think this makes sense so that we can reschedule NumPy 2.0 for
>>>> > September and still provide a release with the Python 3k changes (I am
>>>> > assuming these can be done in an ABI-compatible way).
>>>>
>>>> As far as I remember, the Py3 changes did not require breaking ABI (on
>>>> Py2).
>>>>
>>>> Good. How much work is it to remove datetime, and who wants to do it?
>>>
>>>
>> Hey, I thought that was your job ;) Maybe the best thing is to start by
>> making a branch for the removal and we can argue about who gets the short
>> end of the stick later...
>>
>> Manage the release yes, do the heavy lifting not necessarily:) OK, I'll
> make the branch tonight.
>
> If the removal just involves the same changes that were made for 1.4.1 then
> I can do it. But I'm not familiar with this code and if it's more work, I
> probably won't have time for it between the scipy 0.8.0 release and my 'real
> job', like you call it.
>
>
I think it may be a bit trickier because Travis made more changes.
I think the relevant commits are r8113..r8115 and 8107..8108. After removing
those we still need to remove the same stuff as we did for the 1.4.1 release.
We will need some way of testing if the removal was successful.

We probably want to make sure there is a documentation update also, maybe
before making the branch.

Chuck
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From sterini at gmail.com Sun May 30 23:18:01 2010
From: sterini at gmail.com (Ilya Sterin)
Date: Sun, 30 May 2010 23:18:01 -0400
Subject: [Numpy-discussion] linalg.det throws IndexError on 2.6.5
In-Reply-To:
References:
Message-ID:

Yeah, there were no previous installs. Well, I removed the pip installed
numpy and did a python setup.py install and all works now. Really bizarre,
I didn't think pip did much outside of the standard python setup.py
install, other than some more automation.

Thanks for the help.

Ilya

On Sun, May 30, 2010 at 5:15 PM, Charles R Harris wrote:
>
>
> On Sun, May 30, 2010 at 2:22 PM, Ilya Sterin wrote:
>>
>> Numpy 1.4.1. I built it myself.
>>
>> The version of numpy in question on CentOS was built against a 2.6.5
>> version of python 64 bit binary and built/installed with pip.
>>
>> The version which is working on my OS X is a universal binary and both
>> python and numpy are built as such. I'm running python in 64bit mode
>> though and all works fine there.
>>
>> I just tried to install numpy as per yum instructions on numpy site
>> with the default CentOS python 2.4. It installed numpy 1.2 and the
>> determinant function works there.
>>
>> I need to get this to work with 2.6 though, as my app relies on it. Any
>> ideas?
>>
>> Thanks.
>>
>
> It works fine for me on 2.6, which is why I asked for more details.
>
> $[charris at ubuntu ~]$ python
> Python 2.6.5 (r265:79063, Apr 16 2010, 13:57:41)
> [GCC 4.4.3] on linux2
> Type "help", "copyright", "credits" or "license" for more information.
>>>> import numpy
>>>> numpy.linalg.det(numpy.array([[1, 2], [3, 4]]))
> -2.0
>>>> numpy.__version__
> '1.4.1'
>
> Did you remove the previous installation of numpy, if any? And why not do
> the usual "sudo python setup.py install" thingie instead of pip?
>
> Chuck
>
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
>
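A short aside on the fix above: when an interpreter has seen more than one
numpy install (pip, setup.py, or distribution packages), it is worth
confirming which copy it actually imports before drawing conclusions. A
minimal check, illustrative only and not part of the original exchange; the
version shown is the one from this thread and the path simply follows the
install prefix visible in the traceback above:

>>> import numpy
>>> numpy.__version__
'1.4.1'
>>> numpy.__file__
'/opt/python2.6/lib/python2.6/site-packages/numpy/__init__.pyc'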
From ralf.gommers at googlemail.com Mon May 31 06:54:49 2010
From: ralf.gommers at googlemail.com (Ralf Gommers)
Date: Mon, 31 May 2010 18:54:49 +0800
Subject: [Numpy-discussion] Introduction to Scott, Jason, and (possibly) others from Enthought
In-Reply-To:
References: <2D706A49-3574-4E5D-81CD-B89B5E6BD0B5@enthought.com> <11E425D5-D34A-44FC-A0B1-6D2641D522EA@enthought.com>
Message-ID:

On Mon, May 31, 2010 at 8:23 AM, Charles R Harris wrote:
>
>
> On Sun, May 30, 2010 at 5:53 PM, Ralf Gommers
> wrote:
>
>>
>>
>> On Mon, May 31, 2010 at 2:06 AM, Charles R Harris <
>> charlesr.harris at gmail.com> wrote:
>>
>>>
>>> Hey, I thought that was your job ;) Maybe the best thing is to start by
>>>> making a branch for the removal and we can argue about who gets the short
>>>> end of the stick later...
>>>>
>>>> Manage the release yes, do the heavy lifting not necessarily:) OK, I'll
>> make the branch tonight.
>>
>> If the removal just involves the same changes that were made for 1.4.1
>> then I can do it. But I'm not familiar with this code and if it's more work,
>> I probably won't have time for it between the scipy 0.8.0 release and my
>> 'real job', like you call it.
>>
>>
> I think it may be a bit trickier because Travis made more changes. I think
> the relevant commits are r8113..r8115 and 8107..8108. After removing those
> we still need to remove the same stuff as we did for the 1.4.1 release. We
> will need some way of testing if the removal was successful.
>
That still looks like it's not an insane amount of work.

We probably want to make sure there is a documentation update also, maybe
> before making the branch.
>
>
I checked and there's not too much to merge, most of the changes (
http://docs.scipy.org/numpy/patch/) don't apply cleanly or at all. The
latter because they're docs for constants, lists, etc.

The biggest chunk of recent changes is for the polynomial and chebyshev
docs, can you give your opinion on those Charles? The OK to Apply is set to
True for all of them, but I'm not sure who did that. The patch generation
won't work for many of those docs, so if you could check if any docs should
be merged manually right now that would be useful.

@ David G: there are some conflicts in docs you recently edited,
http://docs.scipy.org/numpy/merge/. Would you mind resolving those?

Thanks,
Ralf
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From charlesr.harris at gmail.com Mon May 31 11:32:27 2010
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Mon, 31 May 2010 09:32:27 -0600
Subject: [Numpy-discussion] Introduction to Scott, Jason, and (possibly) others from Enthought
In-Reply-To:
References: <2D706A49-3574-4E5D-81CD-B89B5E6BD0B5@enthought.com> <11E425D5-D34A-44FC-A0B1-6D2641D522EA@enthought.com>
Message-ID:

On Mon, May 31, 2010 at 4:54 AM, Ralf Gommers wrote:
>
>
> On Mon, May 31, 2010 at 8:23 AM, Charles R Harris <
> charlesr.harris at gmail.com> wrote:
>
>>
>>
>> On Sun, May 30, 2010 at 5:53 PM, Ralf Gommers <
>> ralf.gommers at googlemail.com> wrote:
>>
>>>
>>>
>>> On Mon, May 31, 2010 at 2:06 AM, Charles R Harris <
>>> charlesr.harris at gmail.com> wrote:
>>>
>>>>
>>>> Hey, I thought that was your job ;) Maybe the best thing is to start by
>>>>> making a branch for the removal and we can argue about who gets the short
>>>>> end of the stick later...
>>>>>
>>>>> Manage the release yes, do the heavy lifting not necessarily:) OK, I'll
>>> make the branch tonight.
>>>
>>> If the removal just involves the same changes that were made for 1.4.1
>>> then I can do it. But I'm not familiar with this code and if it's more work,
>>> I probably won't have time for it between the scipy 0.8.0 release and my
>>> 'real job', like you call it.
>>>
>>>
>> I think it may be a bit trickier because Travis made more changes. I think
>> the relevant commits are r8113..r8115 and 8107..8108. After removing those
>> we still need to remove the same stuff as we did for the 1.4.1 release. We
>> will need some way of testing if the removal was successful.
>>
>
> That still looks like it's not an insane amount of work.
>
> We probably want to make sure there is a documentation update also, maybe
>> before making the branch.
>>
>> I checked and there's not too much to merge, most of the changes (
> http://docs.scipy.org/numpy/patch/) don't apply cleanly or at all. The
> latter because they're docs for constants, lists, etc.
>
> The biggest chunk of recent changes is for the polynomial and chebyshev
> docs, can you give your opinion on those Charles?
> The OK to Apply is set to
> True for all of them, but I'm not sure who did that.

I'll go have a look. David (G) worked with me off-line so I expect they are
generally OK. The markup seems a bit excessive to me, but that is probably a
tradeoff between the appearance of the generated docs and the terminal help.
I don't know what is standard practice there these days.

> The patch generation won't work for many of those docs, so if you could
> check if any docs should be merged manually right now that would be useful.
>
>
The {Chebyshev, Polynomial} documentation that has conflicts can't be edited
on the web, as they reference documents generated from a template at module
load time. I'll fix that at some point, but at the moment it is easier to
maintain one template than to maintain two separate versions. I'll hand
merge anything there that looks appropriate.

@ David G: there are some conflicts in docs you recently edited,
> http://docs.scipy.org/numpy/merge/. Would you mind resolving those?
>

Chuck
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From d.l.goldsmith at gmail.com Mon May 31 12:59:59 2010
From: d.l.goldsmith at gmail.com (David Goldsmith)
Date: Mon, 31 May 2010 09:59:59 -0700
Subject: [Numpy-discussion] Introduction to Scott, Jason, and (possibly) others from Enthought
In-Reply-To:
References: <2D706A49-3574-4E5D-81CD-B89B5E6BD0B5@enthought.com> <11E425D5-D34A-44FC-A0B1-6D2641D522EA@enthought.com>
Message-ID:

On Mon, May 31, 2010 at 3:54 AM, Ralf Gommers wrote:
>
>
> On Mon, May 31, 2010 at 8:23 AM, Charles R Harris <
> charlesr.harris at gmail.com> wrote:
>
>>
>>
>> On Sun, May 30, 2010 at 5:53 PM, Ralf Gommers <
>> ralf.gommers at googlemail.com> wrote:
>>
>>>
>>>
>>> On Mon, May 31, 2010 at 2:06 AM, Charles R Harris <
>>> charlesr.harris at gmail.com> wrote:
>>>
>>>>
>>>> Hey, I thought that was your job ;) Maybe the best thing is to start by
>>>>> making a branch for the removal and we can argue about who gets the short
>>>>> end of the stick later...
>>>>>
>>>>> Manage the release yes, do the heavy lifting not necessarily:) OK, I'll
>>> make the branch tonight.
>>>
>>> If the removal just involves the same changes that were made for 1.4.1
>>> then I can do it. But I'm not familiar with this code and if it's more work,
>>> I probably won't have time for it between the scipy 0.8.0 release and my
>>> 'real job', like you call it.
>>>
>>>
>> I think it may be a bit trickier because Travis made more changes. I think
>> the relevant commits are r8113..r8115 and 8107..8108. After removing those
>> we still need to remove the same stuff as we did for the 1.4.1 release. We
>> will need some way of testing if the removal was successful.
>>
>
> That still looks like it's not an insane amount of work.
>
> We probably want to make sure there is a documentation update also, maybe
>> before making the branch.
>>
>> I checked and there's not too much to merge, most of the changes (
> http://docs.scipy.org/numpy/patch/) don't apply cleanly or at all. The
> latter because they're docs for constants, lists, etc.
>
> The biggest chunk of recent changes is for the polynomial and chebyshev
> docs, can you give your opinion on those Charles? The OK to Apply is set to
> True for all of them, but I'm not sure who did that. The patch generation
> won't work for many of those docs, so if you could check if any docs should
> be merged manually right now that would be useful.
>
> @ David G: there are some conflicts in docs you recently edited,
> http://docs.scipy.org/numpy/merge/. Would you mind resolving those?
>
Nothing to resolve: those docstrings are generated via a template and should
not be modified via the wiki at all - any differences between what's in the
wiki and svn are inadvertent and can be ignored.

DG

> Thanks,
> Ralf
>
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
>

--
Mathematician: noun, someone who disavows certainty when their uncertainty
set is non-empty, even if that set has measure zero.

Hope: noun, that delusive spirit which escaped Pandora's jar and, with her
lies, prevents mankind from committing a general suicide. (As interpreted by
Robert Graves)
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
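A closing note on the template-generated docstrings mentioned in the last two
messages: the pattern being described is a single documentation template that
is expanded for each series class when the module is imported, so only one
copy of the text has to be maintained and the web wiki has nothing separate
to edit. A minimal sketch of that idea; the names and template text below are
purely illustrative and are not numpy's actual polynomial code:

_doc_template = """A %(name)s series class (illustrative sketch).

Coefficients are stored in order of increasing degree.
"""

def _make_series_class(name):
    # Build each class with its docstring filled in from the shared template,
    # so the documentation text lives in exactly one place.
    return type(name, (object,), {'__doc__': _doc_template % {'name': name}})

Polynomial = _make_series_class('Polynomial')
Chebyshev = _make_series_class('Chebyshev')

print(Polynomial.__doc__)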