From kwgoodman at gmail.com Wed Dec 1 12:08:13 2010
From: kwgoodman at gmail.com (Keith Goodman)
Date: Wed, 1 Dec 2010 09:08:13 -0800
Subject: [SciPy-User] [ANN] Bottleneck 0.1
Message-ID:

This is the first release of Bottleneck, a collection of fast NumPy array functions written in Cython.

The three categories of Bottleneck functions:

- Faster replacements for NumPy and SciPy functions
- Moving window functions
- Group functions that bin calculations by like-labeled elements

Function signatures (using nanmean as an example):

Functions      nanmean(arr, axis=None)
Moving window  move_mean(arr, window, axis=0)
Group by       group_nanmean(arr, label, order=None, axis=0)

Let's give it a try. Create a NumPy array:

>>> import numpy as np
>>> arr = np.array([1, 2, np.nan, 4, 5])

Find the nanmean:

>>> import bottleneck as bn
>>> bn.nanmean(arr)
3.0

Moving window nanmean:

>>> bn.move_nanmean(arr, window=2)
array([ nan,  1.5,  2. ,  4. ,  4.5])

Group nanmean:

>>> label = ['a', 'a', 'b', 'b', 'a']
>>> bn.group_nanmean(arr, label)
(array([ 2.66666667,  4.        ]), ['a', 'b'])

Fast
====

Bottleneck is fast:

>>> arr = np.random.rand(100, 100)
>>> timeit np.nanmax(arr)
10000 loops, best of 3: 99.6 us per loop
>>> timeit bn.nanmax(arr)
100000 loops, best of 3: 15.3 us per loop

Let's not forget to add some NaNs:

>>> arr[arr > 0.5] = np.nan
>>> timeit np.nanmax(arr)
10000 loops, best of 3: 146 us per loop
>>> timeit bn.nanmax(arr)
100000 loops, best of 3: 15.2 us per loop

Bottleneck comes with a benchmark suite that compares the performance of the Bottleneck functions that have a NumPy/SciPy equivalent. To run the benchmark:

>>> bn.benchit(verbose=False)
Bottleneck performance benchmark
    Bottleneck  0.1.0dev
    Numpy       1.5.1
    Scipy       0.8.0
    Speed is numpy (or scipy) time divided by Bottleneck time
    NaN means all NaNs
   Speed   Test                  Shape       dtype    NaN?
   2.4019  median(a, axis=-1)    (500,500)   float64
   2.2668  median(a, axis=-1)    (500,500)   float64  NaN
   4.1235  median(a, axis=-1)    (10000,)    float64
   4.3498  median(a, axis=-1)    (10000,)    float64  NaN
   9.8184  nanmax(a, axis=-1)    (500,500)   float64
   7.9157  nanmax(a, axis=-1)    (500,500)   float64  NaN
   9.2306  nanmax(a, axis=-1)    (10000,)    float64
   8.1635  nanmax(a, axis=-1)    (10000,)    float64  NaN
   6.7218  nanmin(a, axis=-1)    (500,500)   float64
   7.9112  nanmin(a, axis=-1)    (500,500)   float64  NaN
   6.4950  nanmin(a, axis=-1)    (10000,)    float64
   8.0791  nanmin(a, axis=-1)    (10000,)    float64  NaN
  12.3650  nanmean(a, axis=-1)   (500,500)   float64
  42.0738  nanmean(a, axis=-1)   (500,500)   float64  NaN
  12.2769  nanmean(a, axis=-1)   (10000,)    float64
  22.1285  nanmean(a, axis=-1)   (10000,)    float64  NaN
   9.5515  nanstd(a, axis=-1)    (500,500)   float64
  68.9192  nanstd(a, axis=-1)    (500,500)   float64  NaN
   9.2174  nanstd(a, axis=-1)    (10000,)    float64
  26.1753  nanstd(a, axis=-1)    (10000,)    float64  NaN

Faster
======

Under the hood Bottleneck uses a separate Cython function for each combination of ndim, dtype, and axis. A lot of the overhead in bn.nanmax(), for example, is in checking that the axis is within range, converting non-array data to an array, and selecting the function to use to calculate the maximum.
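(Conceptually, that per-call dispatch is something like the sketch below -- a plain-Python illustration with made-up names, not Bottleneck's actual internals, which are generated Cython functions:)

import numpy as np

# One specialized kernel per (ndim, dtype, axis) combination; this dict
# lookup stands in for Bottleneck's generated Cython functions.
def _nanmax_2d_float64_axis0(a):
    return np.nanmax(a, axis=0)  # placeholder for the fast Cython kernel

_dispatch = {(2, np.dtype('float64'), 0): _nanmax_2d_float64_axis0}

def nanmax(arr, axis=None):
    a = np.asarray(arr)  # convert non-array input to an array
    if axis is not None and not -a.ndim <= axis < a.ndim:
        raise ValueError("axis out of range")  # check the axis
    return _dispatch[(a.ndim, a.dtype, axis)](a)  # select and call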
You can get rid of the overhead by doing all this before you, say, enter an inner loop:

>>> arr = np.random.rand(10,10)
>>> func, a = bn.func.nanmax_selector(arr, axis=0)
>>> func

Let's see how much faster this runs:

>> timeit np.nanmax(arr, axis=0)
10000 loops, best of 3: 25.7 us per loop
>> timeit bn.nanmax(arr, axis=0)
100000 loops, best of 3: 5.25 us per loop
>> timeit func(a)
100000 loops, best of 3: 2.5 us per loop

Note that func is faster than NumPy's non-NaN version of max:

>> timeit arr.max(axis=0)
100000 loops, best of 3: 3.28 us per loop

So adding NaN protection to your inner loops comes at a negative cost!

Functions
=========

Bottleneck is in the prototype stage. Bottleneck contains the following functions:

median
nanmean
nanvar
nanstd
nanmin
nanmax
move_nanmean
group_nanmean

Currently only 1d, 2d, and 3d NumPy arrays with dtype int32, int64, and float64 are supported.

License
=======

Bottleneck is distributed under a Simplified BSD license. Parts of NumPy, SciPy and numpydoc, all of which have BSD licenses, are included in Bottleneck. See the LICENSE file, which is distributed with Bottleneck, for details.

URLs
====

download      http://pypi.python.org/pypi/Bottleneck
docs          http://berkeleyanalytics.com/bottleneck
code          http://github.com/kwgoodman/bottleneck
mailing list  http://groups.google.com/group/bottle-neck

Install
=======

Requirements:

Bottleneck  Python, NumPy 1.5.1+, SciPy 0.8.0+
Unit tests  nose
Compile     gcc or MinGW

**GNU/Linux, Mac OS X, et al.**

To install Bottleneck:

$ python setup.py build
$ sudo python setup.py install

Or, if you wish to specify where Bottleneck is installed, for example inside /usr/local:

$ python setup.py build
$ sudo python setup.py install --prefix=/usr/local

**Windows**

In order to compile the C code in Bottleneck you need a Windows version of the gcc compiler. MinGW (Minimalist GNU for Windows) contains gcc and has been used to successfully compile Bottleneck on Windows. Install MinGW and add it to your system path. Then install Bottleneck with the commands:

python setup.py build --compiler=mingw32
python setup.py install

**Post install**

After you have installed Bottleneck, run the suite of unit tests:

>>> import bottleneck as bn
>>> bn.test()

Ran 10 tests in 13.756s
OK

From pauloa.herrera at gmail.com Wed Dec 1 13:15:19 2010
From: pauloa.herrera at gmail.com (Paulo Herrera)
Date: Wed, 1 Dec 2010 19:15:19 +0100
Subject: [SciPy-User] Announcement: Self-contained Python module to write binary VTK files.
In-Reply-To:
References: <0F37073C-2AE8-4C65-A254-0943317B8FF1@gmail.com> <826B8A8A-0C10-4FF8-BA13-096E79999BB6@gmail.com>
Message-ID:

Hi,

I just changed the license of the files in the repository to a FreeBSD license. I hope this will make it easier to use the module.

Paulo

On Tue, Nov 30, 2010 at 9:19 AM, Matthew Brett wrote:
> Hi,
>
>>>> PyEVTK is released under the GPL 3 open source license. A copy of the license is
>>>> included in the src directory.
>>>
>>> Would you consider changing to a more permissive license? We
>>> (nipy.org) would have good use of your package, I believe, but we're
>>> using the BSD license.
>>
>> I'd like to release it with a license that is compatible with the GPL license. It seems that the FreeBSD license satisfies that requirement (http://en.wikipedia.org/wiki/BSD_licenses). Would the FreeBSD be useful for you?
>
> That's great - thank you. We use the 3-clause BSD license mainly [1],
> and the MIT license in one project, but the 2-clause 'simplified' BSD
> that FreeBSD uses is ideal.
>
> Thanks again,
>
> Matthew
>
> [1] http://www.opensource.org/licenses/bsd-license.php
> _______________________________________________
> SciPy-User mailing list
> SciPy-User at scipy.org
> http://mail.scipy.org/mailman/listinfo/scipy-user
>

From tjhnson at gmail.com Wed Dec 1 18:49:19 2010
From: tjhnson at gmail.com (T J)
Date: Wed, 1 Dec 2010 15:49:19 -0800
Subject: [SciPy-User] Bottleneck
In-Reply-To:
References: <1291160954.1783.5.camel@Portable-s2m.cnrs-mrs.fr> <1291165744.1783.9.camel@Portable-s2m.cnrs-mrs.fr> <1291167742.3733.3.camel@Portable-s2m.cnrs-mrs.fr> <1291171140.3733.6.camel@Portable-s2m.cnrs-mrs.fr>
Message-ID:

On Tue, Nov 30, 2010 at 7:04 PM, Keith Goodman wrote:
> I bumped the Bottleneck requirements from "NumPy, SciPy" to "NumPy
> 1.5.1+, SciPy 0.8.0+". I think that is fair to do for a brand new
> project.

If SciPy is only used in the benchmarks/tests, then why not make it an optional benchmark/test that runs only if SciPy is present? nose.SkipTest should be useful here. I frequently run software on machines that only have NumPy installed.

From kwgoodman at gmail.com Wed Dec 1 19:09:10 2010
From: kwgoodman at gmail.com (Keith Goodman)
Date: Wed, 1 Dec 2010 16:09:10 -0800
Subject: [SciPy-User] Bottleneck
In-Reply-To:
References: <1291160954.1783.5.camel@Portable-s2m.cnrs-mrs.fr> <1291165744.1783.9.camel@Portable-s2m.cnrs-mrs.fr> <1291167742.3733.3.camel@Portable-s2m.cnrs-mrs.fr> <1291171140.3733.6.camel@Portable-s2m.cnrs-mrs.fr>
Message-ID:

On Wed, Dec 1, 2010 at 3:49 PM, T J wrote:
> On Tue, Nov 30, 2010 at 7:04 PM, Keith Goodman wrote:
>> I bumped the Bottleneck requirements from "NumPy, SciPy" to "NumPy
>> 1.5.1+, SciPy 0.8.0+". I think that is fair to do for a brand new
>> project.
>
> If SciPy is only used in the benchmarks/tests, then why not make it an
> optional benchmark/test that runs only if SciPy is present?
> nose.SkipTest should be useful here. I frequently run software on
> machines that only have NumPy installed.

Seems like a strange discussion to have on the scipy list :)

I don't want to have a hole in my unit test coverage. But I could copy over the nan functions in scipy stats. And I guess the benchmark could use those too. And then skip moving window benchmarks against scipy.ndimage for those who don't have scipy installed.

From Chris.Barker at noaa.gov Wed Dec 1 19:19:04 2010
From: Chris.Barker at noaa.gov (Christopher Barker)
Date: Wed, 01 Dec 2010 16:19:04 -0800
Subject: [SciPy-User] Bottleneck
In-Reply-To:
References: <1291160954.1783.5.camel@Portable-s2m.cnrs-mrs.fr> <1291165744.1783.9.camel@Portable-s2m.cnrs-mrs.fr> <1291167742.3733.3.camel@Portable-s2m.cnrs-mrs.fr> <1291171140.3733.6.camel@Portable-s2m.cnrs-mrs.fr>
Message-ID: <4CF6E5F8.5030403@noaa.gov>

On 12/1/10 4:09 PM, Keith Goodman wrote:
>> I frequently run software on
>> machines that only have NumPy installed.
>
> Seems like a strange discussion to have on the scipy list :)

True -- and yet I didn't have scipy on this machine yet, either...

> I don't want to have a hole in my unit test coverage. But I could copy
> over the nan functions in scipy stats. And I guess the benchmark could
> use those too. And then skip moving window benchmarks against
> scipy.ndimage for those who don't have scipy installed.

I'd vote to have unit tests that don't require scipy, but I think it's fine that the benchmarks do -- that's kind of the point of them -- comparing bottleneck to the raw scipy functions.

-Chris

--
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker at noaa.gov

From kwgoodman at gmail.com Wed Dec 1 19:21:33 2010
From: kwgoodman at gmail.com (Keith Goodman)
Date: Wed, 1 Dec 2010 16:21:33 -0800
Subject: [SciPy-User] Bottleneck
In-Reply-To: <4CF6E5F8.5030403@noaa.gov>
References: <1291160954.1783.5.camel@Portable-s2m.cnrs-mrs.fr> <1291165744.1783.9.camel@Portable-s2m.cnrs-mrs.fr> <1291167742.3733.3.camel@Portable-s2m.cnrs-mrs.fr> <1291171140.3733.6.camel@Portable-s2m.cnrs-mrs.fr> <4CF6E5F8.5030403@noaa.gov>
Message-ID:

On Wed, Dec 1, 2010 at 4:19 PM, Christopher Barker wrote:
> On 12/1/10 4:09 PM, Keith Goodman wrote:
>>> I frequently run software on
>>> machines that only have NumPy installed.
>>
>> Seems like a strange discussion to have on the scipy list :)
>
> True -- and yet I didn't have scipy on this machine yet, either...
>
>> I don't want to have a hole in my unit test coverage. But I could copy
>> over the nan functions in scipy stats. And I guess the benchmark could
>> use those too. And then skip moving window benchmarks against
>> scipy.ndimage for those who don't have scipy installed.
>
> I'd vote to have unit tests that don't require scipy, but I think it's
> fine that the benchmarks do -- that's kind of the point of them --
> comparing bottleneck to the raw scipy functions.

Well, now I have a most requested feature. OK, I'll do it for 0.2.
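(A minimal sketch of the conditional skip T J suggests, assuming nose as the test runner; the test name is made up, and this is not from Bottleneck's actual test suite:)

import numpy as np
import nose

def test_nanmean_matches_scipy():
    try:
        from scipy import stats
    except ImportError:
        # skip instead of fail on machines that only have NumPy
        raise nose.SkipTest("SciPy not installed")
    import bottleneck as bn
    arr = np.random.rand(10, 10)
    np.testing.assert_array_almost_equal(bn.nanmean(arr, axis=0),
                                         stats.nanmean(arr, axis=0))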
From bsder at allcaps.org Thu Dec 2 11:17:20 2010
From: bsder at allcaps.org (Andrew Lentvorski)
Date: Thu, 02 Dec 2010 08:17:20 -0800
Subject: [SciPy-User] Accurate Frequency Measurement
In-Reply-To: <4CF4A0D1.4040709@silveregg.co.jp>
References: <20101130014157.GA9408@spirit> <4CF4A0D1.4040709@silveregg.co.jp>
Message-ID: <4CF7C690.2080400@allcaps.org>

On 11/29/10 10:59 PM, David wrote:
> You may want to look at something like CLAM (http://clam-project.org) to
> analyse those signals if you want to track frequency changes. I believe
> they have some python bindings.

That looks like a really nice project, but it looks really dead. Is there anything that uses CLAM that is up to date?

-a

From ptittmann at gmail.com Thu Dec 2 14:55:57 2010
From: ptittmann at gmail.com (Peter Tittmann)
Date: Thu, 2 Dec 2010 11:55:57 -0800
Subject: [SciPy-User] ancova with optimize.curve_fit
Message-ID:

Greetings,

I'm attempting to conduct analysis of covariance (ANCOVA) using a non-linear regression with curve_fit in optimize. The doc string states:

"xdata : An N-length sequence or an (k,N)-shaped array for functions with k predictors. The independent variable where the data is measured."

I am hoping that this means that if I pass the independent variable and a categorical variable, the resulting covariance matrix will reflect the variability in the equation coefficients among the categorical variables.

1. Is this the case?
2. If so, I'm having a problem with the input array for xdata. The following extracts data from a relational database (that's the SQL). The eqCoeff() function works fine, however when I add a second dimension to the xdata in the ancova() function (indHtPlot), curve_fit produces an error which seems to be related to the structure of my input array. I've tried column_stack and vstack to form the arrays. Any assistance would be gratefully received.
import birdseye_db as db
import numpy as np
from scipy.optimize import curve_fit

def getDiam(ht, a, b):
    dbh = a * ht**b
    return dbh

def eqCoeff():
    '''estimates coefficients a and b in dbh = a * h**b using all trees where height was measured'''
    species = [i[0].strip(' ') for i in db.query('select distinct species from plots')]
    res3d = db.query('select dbh, height, species from plots where ht_code=1')
    indHt = [i[1] for i in res3d]
    depDbh = [i[0] for i in res3d]
    estimated_params, err_est = curve_fit(getDiam, indHt, depDbh)
    return estimated_params, err_est

def ancova():
    res = db.query('select dbh, height, plot, species from plots where ht_code=1')
    indHtPlot = np.column_stack(([i[1] for i in res], [i[2] for i in res]))
    depDbh = [i[0] for i in res]
    estimated_params, err_est = curve_fit(getDiam, indHtPlot, depDbh)
    return estimated_params, err_est

Thanks in advance

--
Peter Tittmann
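(For reference, the (k,N) convention in the curve_fit docstring means xdata is handed to the model function as-is, one row per predictor. A self-contained sketch with synthetic data and made-up coefficients, not the poster's actual model:)

import numpy as np
from scipy.optimize import curve_fit

def model(x, a, b, c):
    ht, plot = x                  # x has shape (2, N): one row per predictor
    return a * ht**b + c * plot   # returns a 1d array of length N

ht = np.linspace(1.0, 30.0, 50)
plot = np.repeat([0.0, 1.0], 25)
dbh = 0.5 * ht**1.2 + 2.0 * plot
xdata = np.vstack((ht, plot))     # shape (2, N), not column_stack's (N, 2)
params, cov = curve_fit(model, xdata, dbh)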
From josef.pktd at gmail.com Thu Dec 2 16:01:19 2010
From: josef.pktd at gmail.com (josef.pktd at gmail.com)
Date: Thu, 2 Dec 2010 16:01:19 -0500
Subject: [SciPy-User] ancova with optimize.curve_fit
In-Reply-To:
References:
Message-ID:

On Thu, Dec 2, 2010 at 2:55 PM, Peter Tittmann wrote:
> Greetings,
> I'm attempting to conduct analysis of covariance (ANCOVA) using a non-linear
> regression with curve_fit in optimize. The doc string states:
> "xdata : An N-length sequence or an (k,N)-shaped array for functions with k
> predictors. The independent variable where the data is measured."
> I am hoping that this means that if I pass the independent variable and a
> categorical variable, the resulting covariance matrix will reflect the
> variability in the equation coefficients among the categorical variables.
> 1. Is this the case?
> 2. If so, I'm having a problem with the input array for xdata. The following
> extracts data from a relational database (that's the SQL). The eqCoeff()
> function works fine, however when I add a second dimension to the xdata in
> the ancova() function (indHtPlot), curve_fit produces an error which seems to
> be related to the structure of my input array. I've tried column_stack and
> vstack to form the arrays. Any assistance would be gratefully received.
>
> import birdseye_db as db
> import numpy as np
> from scipy.optimize import curve_fit
>
> def getDiam(ht, a, b):
>     dbh = a * ht**b
>     return dbh
>
> def eqCoeff():
>     '''estimates coefficients a and b in dbh = a * h**b using all trees where
> height was measured'''
>     species = [i[0].strip(' ') for i in db.query('select distinct species from
> plots')]
>     res3d = db.query('select dbh, height, species from plots where ht_code=1')
>     indHt = [i[1] for i in res3d]
>     depDbh = [i[0] for i in res3d]
>     estimated_params, err_est = curve_fit(getDiam, indHt, depDbh)
>     return estimated_params, err_est
>
> def ancova():
>     res = db.query('select dbh, height, plot, species from plots where
> ht_code=1')
>     indHtPlot = np.column_stack(([i[1] for i in res], [i[2] for i in res]))
>     depDbh = [i[0] for i in res]
>     estimated_params, err_est = curve_fit(getDiam, indHtPlot, depDbh)
>     return estimated_params, err_est

Can you post the actual traceback?

My first guess, but I have to look it up, is that you need to transpose xdata, e.g. indHtPlot.T

curve_fit(getDiam, indHtPlot.T, depDbh)

Josef

> Thanks in advance
> --
> Peter Tittmann
>
> _______________________________________________
> SciPy-User mailing list
> SciPy-User at scipy.org
> http://mail.scipy.org/mailman/listinfo/scipy-user
>

From ptittmann at gmail.com Thu Dec 2 16:10:33 2010
From: ptittmann at gmail.com (Peter Tittmann)
Date: Thu, 2 Dec 2010 13:10:33 -0800
Subject: [SciPy-User] ancova with optimize.curve_fit
In-Reply-To:
References:
Message-ID: <0DD9408C0B1B40B99E1863F77126D466@gmail.com>

Thanks very much for your reply Josef, here are the tracebacks from both the original and from your suggestion:

with: curve_fit(getDiam, indHtPlot, depDbh)

In [20]: ancova()
------------------------------------------------------------
Traceback (most recent call last):
  File "", line 1, in
  File "", line 5, in ancova
  File "/usr/local/lib/python2.6/dist-packages/scipy/optimize/minpack.py", line 422, in curve_fit
    res = leastsq(func, p0, args=args, full_output=1, **kw)
  File "/usr/local/lib/python2.6/dist-packages/scipy/optimize/minpack.py", line 273, in leastsq
    m = check_func(func,x0,args,n)[0]
  File "/usr/local/lib/python2.6/dist-packages/scipy/optimize/minpack.py", line 13, in check_func
    res = atleast_1d(thefunc(*((x0[:numinputs],)+args)))
  File "/usr/local/lib/python2.6/dist-packages/scipy/optimize/minpack.py", line 343, in _general_function
    return function(xdata, *params) - ydata
ValueError: shape mismatch: objects cannot be broadcast to a single shape

with curve_fit(getDiam, indHtPlot.T, depDbh)

In [22]: ancova()
Error in sys.excepthook:
Traceback (most recent call last):
  File "/usr/lib/pymodules/python2.6/IPython/iplib.py", line 2028, in excepthook
    self.showtraceback((etype,value,tb),tb_offset=0)
  File "/usr/lib/pymodules/python2.6/IPython/iplib.py", line 1729, in showtraceback
    self.InteractiveTB(etype,value,tb,tb_offset=tb_offset)
  File "/usr/lib/pymodules/python2.6/IPython/ultraTB.py", line 998, in __call__
    print >> out, self.text(etype, evalue, etb)
  File "/usr/lib/pymodules/python2.6/IPython/ultraTB.py", line 1012, in text
    return FormattedTB.text(self,etype,value,tb,context=5,mode=mode)
  File "/usr/lib/pymodules/python2.6/IPython/ultraTB.py", line 937, in text
    if len(elist) > self.tb_offset:
TypeError: object of type 'NoneType' has no len()

Original exception was:
ValueError: object too deep for desired array
------------------------------------------------------------
Traceback (most recent call last):
  File "/usr/lib/pymodules/python2.6/IPython/iplib.py", line 2257, in runcode
    exec code_obj in self.user_global_ns, self.user_ns
  File "", line 1, in
  File "", line 5, in ancova
  File "/usr/local/lib/python2.6/dist-packages/scipy/optimize/minpack.py", line 422, in curve_fit
    res = leastsq(func, p0, args=args, full_output=1, **kw)
  File "/usr/local/lib/python2.6/dist-packages/scipy/optimize/minpack.py", line 281, in leastsq
    maxfev, epsfcn, factor, diag)
error: Result from function call is not a proper array of floats.

--
Peter Tittmann

On Thursday, December 2, 2010 at 1:01 PM, josef.pktd at gmail.com wrote:
> On Thu, Dec 2, 2010 at 2:55 PM, Peter Tittmann wrote:
> > Greetings,
> > I'm attempting to conduct analysis of covariance (ANCOVA) using a non-linear
> > regression with curve_fit in optimize. The doc string states:
> > "xdata : An N-length sequence or an (k,N)-shaped array for functions with k
> > predictors. The independent variable where the data is measured."
> > I am hoping that this means that if I pass the independent variable and a > > categorical variable, the resulting covariance matrix will reflect the > > variability in the equation coefficients among the categorical variables. > > 1. Is tis the case? > > 2. If so, i'm having a problem with the input array for xdata. The following > > extracts data from a relational database (thats the sql). the eqCoeff() > > function works fine, however when I add a second dimension to the xdata in > > th ancova() function (indHtPlot), curve fit produces an error which seems to > > be related to the structure of my input array. I've tried column_stack and > > vstack to form the arrays. Any assistance would be gratefully received. > > > > import birdseye_db as db > > import numpy as np > > from scipy.optimize import curve_fit > > def getDiam(ht, a, b): > > dbh = a * ht**b > > return dbh > > def eqCoeff(): > > '''estimates coefficients a and b in dbh= a* h**b using all trees where > > height was measured''' > > species=[i[0].strip(' ') for i in db.query('select distinct species from > > plots')] > > res3d=db.query('select dbh, height, species from plots where ht_code=1') > > indHt=[i[1] for i in res3d] > > depDbh=[i[0] for i in res3d] > > estimated_params, err_est = curve_fit(getDiam, indHt, depDbh) > > return estimated_params, err_est > > > > def ancova(): > > res=db.query('select dbh, height, plot, species from plots where > > ht_code=1') > > indHtPlot= np.column_stack(([i[1] for i in res],[i[2] for i in res] )) > > depDbh=[i[0] for i in res] > > estimated_params, err_est = curve_fit(getDiam, indHtPlot, depDbh) > > return estimated_params, err_est > > > > > > > Can you post the actual traceback? > > My first guess, but I have to look it up, is that you need to > transpose xdata, e.g. indHtPlot.T > > curve_fit(getDiam, indHtPlot.T, depDbh) > > Josef > > > > > Thanks in advance > > -- > > Peter Tittmann > > > > > > _______________________________________________ > > SciPy-User mailing list > > SciPy-User at scipy.org > > http://mail.scipy.org/mailman/listinfo/scipy-user > > > > > > > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jsseabold at gmail.com Thu Dec 2 16:31:37 2010 From: jsseabold at gmail.com (Skipper Seabold) Date: Thu, 2 Dec 2010 16:31:37 -0500 Subject: [SciPy-User] ancova with optimize.curve_fit In-Reply-To: <0DD9408C0B1B40B99E1863F77126D466@gmail.com> References: <0DD9408C0B1B40B99E1863F77126D466@gmail.com> Message-ID: On Thu, Dec 2, 2010 at 4:10 PM, Peter Tittmann wrote: > Thanks very much for your reply Josef, here are the tracebacks from both the > original and from your suggestion: > with:?curve_fit(getDiam, indHtPlot, depDbh) > In [20]: ancova() > ------------------------------------------------------------ > Traceback (most recent call last): > ??File "", line 1, in > ??File "", line 5, in ancova > ??File "/usr/local/lib/python2.6/dist-packages/scipy/optimize/minpack.py", > line 422, in curve_fit > ?? ?res = leastsq(func, p0, args=args, full_output=1, **kw) > ??File "/usr/local/lib/python2.6/dist-packages/scipy/optimize/minpack.py", > line 273, in leastsq > ?? ?m = check_func(func,x0,args,n)[0] > ??File "/usr/local/lib/python2.6/dist-packages/scipy/optimize/minpack.py", > line 13, in check_func > ?? 
?res = atleast_1d(thefunc(*((x0[:numinputs],)+args))) > ??File "/usr/local/lib/python2.6/dist-packages/scipy/optimize/minpack.py", > line 343, in _general_function > ?? ?return function(xdata, *params) - ydata > ValueError: shape mismatch: objects cannot be broadcast to a single shape > > with?curve_fit(getDiam, indHtPlot.T, depDbh) > In [22]: ancova() > Error in sys.excepthook: > Traceback (most recent call last): > ??File "/usr/lib/pymodules/python2.6/IPython/iplib.py", line 2028, in > excepthook > ?? ?self.showtraceback((etype,value,tb),tb_offset=0) > ??File "/usr/lib/pymodules/python2.6/IPython/iplib.py", line 1729, in > showtraceback > ?? ?self.InteractiveTB(etype,value,tb,tb_offset=tb_offset) > ??File "/usr/lib/pymodules/python2.6/IPython/ultraTB.py", line 998, in > __call__ > ?? ?print >> out, self.text(etype, evalue, etb) > ??File "/usr/lib/pymodules/python2.6/IPython/ultraTB.py", line 1012, in text > ?? ?return FormattedTB.text(self,etype,value,tb,context=5,mode=mode) > ??File "/usr/lib/pymodules/python2.6/IPython/ultraTB.py", line 937, in text > ?? ?if len(elist) > self.tb_offset: > TypeError: object of type 'NoneType' has no len() > Original exception was: > ValueError: object too deep for desired array > ------------------------------------------------------------ > Traceback (most recent call last): > ??File "/usr/lib/pymodules/python2.6/IPython/iplib.py", line 2257, in > runcode > ?? ?exec code_obj in self.user_global_ns, self.user_ns > ??File "", line 1, in > ??File "", line 5, in ancova > ??File "/usr/local/lib/python2.6/dist-packages/scipy/optimize/minpack.py", > line 422, in curve_fit > ?? ?res = leastsq(func, p0, args=args, full_output=1, **kw) > ??File "/usr/local/lib/python2.6/dist-packages/scipy/optimize/minpack.py", > line 281, in leastsq > ?? ?maxfev, epsfcn, factor, diag) > error: Result from function call is not a proper array of floats. > > -- > Peter Tittmann > > On Thursday, December 2, 2010 at 1:01 PM, josef.pktd at gmail.com wrote: > > On Thu, Dec 2, 2010 at 2:55 PM, Peter Tittmann wrote: > > Greetings, > Im attempting to conduct analysis of covariance (ANCOVA) using a non-linear > regression with curve_fit in optimize. The doc string states: > "xdata : An N-length sequence or an (k,N)-shaped array for functions with k > predictors. The independent variable where the data is measured." > I am hoping that this means that if I pass the independent variable and a > categorical variable, the resulting covariance matrix will reflect the > variability in the equation coefficients among the categorical variables. > 1. Is tis the case? > 2. If so, i'm having a problem with the input array for xdata. The following > extracts data from a relational database (thats the sql). the eqCoeff() > function works fine, however when I add a second dimension to the xdata in > th ancova() function (indHtPlot), curve fit produces an error which seems to > be related to the structure of my input array. I've tried column_stack and > vstack to form the arrays. Any assistance would be gratefully received. > > import birdseye_db as db > import numpy as np > from scipy.optimize import curve_fit > def getDiam(ht, a, b): > ?? ?dbh = a * ht**b > ?? ?return dbh > def eqCoeff(): > ?? ?'''estimates coefficients a and b in dbh= a* h**b using all trees where > height was measured''' > ?? ?species=[i[0].strip(' ') for i in db.query('select distinct species from > plots')] > ?? ?res3d=db.query('select dbh, height, species from plots where ht_code=1') > ?? ?indHt=[i[1] for i in res3d] > ?? 
?depDbh=[i[0] for i in res3d] > ?? ?estimated_params, err_est = curve_fit(getDiam, indHt, depDbh) > ?? ?return estimated_params, err_est > > def ancova(): > ?? ?res=db.query('select dbh, height, plot, species from plots where > ht_code=1') > ?? ?indHtPlot= np.column_stack(([i[1] for i in res],[i[2] for i in res] )) > ?? ?depDbh=[i[0] for i in res] > ?? ?estimated_params, err_est = curve_fit(getDiam, indHtPlot, depDbh) > ?? ?return estimated_params, err_est > > > Can you post the actual traceback? > > My first guess, but I have to look it up, is that you need to > transpose xdata, e.g. indHtPlot.T > > curve_fit(getDiam, indHtPlot.T, depDbh) > > Josef > > > Thanks in advance > -- > Peter Tittmann > Can you post a small sample of your data to replicate? Skipper From josef.pktd at gmail.com Thu Dec 2 16:33:51 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Thu, 2 Dec 2010 16:33:51 -0500 Subject: [SciPy-User] ancova with optimize.curve_fit In-Reply-To: References: Message-ID: On Thu, Dec 2, 2010 at 2:55 PM, Peter Tittmann wrote: > Greetings, > Im attempting to conduct analysis of covariance (ANCOVA) using a non-linear > regression with curve_fit in optimize. The doc string states: > "xdata : An N-length sequence or an (k,N)-shaped array for functions with k > predictors. The independent variable where the data is measured." > I am hoping that this means that if I pass the independent variable and a > categorical variable, the resulting covariance matrix will reflect the > variability in the equation coefficients among the categorical variables. > 1. Is tis the case? > 2. If so, i'm having a problem with the input array for xdata. The following > extracts data from a relational database (thats the sql). the eqCoeff() > function works fine, however when I add a second dimension to the xdata in > th ancova() function (indHtPlot), curve fit produces an error which seems to > be related to the structure of my input array. I've tried column_stack and > vstack to form the arrays. Any assistance would be gratefully received. > > import birdseye_db as db > import numpy as np > from scipy.optimize import curve_fit > def getDiam(ht, a, b): > ?? ?dbh = a * ht**b > ?? ?return dbh if ht is 2dimensional, then dbh is also two dimensional (n,k). there should be a reduce, e.g. sum in here so that the return is 1d. Josef > def eqCoeff(): > ?? ?'''estimates coefficients a and b in dbh= a* h**b using all trees where > height was measured''' > ?? ?species=[i[0].strip(' ') for i in db.query('select distinct species from > plots')] > ?? ?res3d=db.query('select dbh, height, species from plots where ht_code=1') > ?? ?indHt=[i[1] for i in res3d] > ?? ?depDbh=[i[0] for i in res3d] > ?? ?estimated_params, err_est = curve_fit(getDiam, indHt, depDbh) > ?? ?return estimated_params, err_est > > def ancova(): > ?? ?res=db.query('select dbh, height, plot, species from plots where > ht_code=1') > ?? ?indHtPlot= np.column_stack(([i[1] for i in res],[i[2] for i in res] )) > ?? ?depDbh=[i[0] for i in res] > ?? ?estimated_params, err_est = curve_fit(getDiam, indHtPlot, depDbh) > ?? 
?return estimated_params, err_est > Thanks in advance > -- > Peter Tittmann > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > From ptittmann at gmail.com Thu Dec 2 16:51:31 2010 From: ptittmann at gmail.com (Peter Tittmann) Date: Thu, 2 Dec 2010 13:51:31 -0800 Subject: [SciPy-User] ancova with optimize.curve_fit In-Reply-To: References: Message-ID: here is some of the data. Josef, i'm not sure I understand your suggestion. dbh is the dependent variable and is id (actually a list). The independent variable is height and the categorical variable to test for covariance with is plot. Maybe I'm confused and am trying to do something that cant be done this way... Thanks for any further assistance.. Peter -- Peter Tittmann On Thursday, December 2, 2010 at 1:33 PM, josef.pktd at gmail.com wrote: > On Thu, Dec 2, 2010 at 2:55 PM, Peter Tittmann wrote: > > > Greetings, > > Im attempting to conduct analysis of covariance (ANCOVA) using a non-linear > > regression with curve_fit in optimize. The doc string states: > > "xdata : An N-length sequence or an (k,N)-shaped array for functions with k > > predictors. The independent variable where the data is measured." > > I am hoping that this means that if I pass the independent variable and a > > categorical variable, the resulting covariance matrix will reflect the > > variability in the equation coefficients among the categorical variables. > > 1. Is tis the case? > > 2. If so, i'm having a problem with the input array for xdata. The following > > extracts data from a relational database (thats the sql). the eqCoeff() > > function works fine, however when I add a second dimension to the xdata in > > th ancova() function (indHtPlot), curve fit produces an error which seems to > > be related to the structure of my input array. I've tried column_stack and > > vstack to form the arrays. Any assistance would be gratefully received. > > > > import birdseye_db as db > > import numpy as np > > from scipy.optimize import curve_fit > > def getDiam(ht, a, b): > > dbh = a * ht**b > > return dbh > > > > > > if ht is 2dimensional, then dbh is also two dimensional (n,k). there > should be a reduce, e.g. sum in here so that the return is 1d. 
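(In other words, the model function handed to curve_fit has to return one value per observation. A sketch of the reduced version Josef describes, with hypothetical names, following his sum() suggestion:)

import numpy as np

def getDiamSum(x, a, b):
    # x has shape (nobs, 2); summing across columns reduces the
    # (nobs, 2) result of a * x**b to the (nobs,) shape curve_fit expects
    return np.sum(a * x**b, axis=1)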
> > Josef > > > > > def eqCoeff(): > > '''estimates coefficients a and b in dbh= a* h**b using all trees where > > height was measured''' > > species=[i[0].strip(' ') for i in db.query('select distinct species from > > plots')] > > res3d=db.query('select dbh, height, species from plots where ht_code=1') > > indHt=[i[1] for i in res3d] > > depDbh=[i[0] for i in res3d] > > estimated_params, err_est = curve_fit(getDiam, indHt, depDbh) > > return estimated_params, err_est > > > > def ancova(): > > res=db.query('select dbh, height, plot, species from plots where > > ht_code=1') > > indHtPlot= np.column_stack(([i[1] for i in res],[i[2] for i in res] )) > > depDbh=[i[0] for i in res] > > estimated_params, err_est = curve_fit(getDiam, indHtPlot, depDbh) > > return estimated_params, err_est > > Thanks in advance > > -- > > Peter Tittmann > > > > > > _______________________________________________ > > SciPy-User mailing list > > SciPy-User at scipy.org > > http://mail.scipy.org/mailman/listinfo/scipy-user > > > > > > > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: db_out.csv Type: application/octet-stream Size: 1738 bytes Desc: not available URL: From josef.pktd at gmail.com Thu Dec 2 17:43:16 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Thu, 2 Dec 2010 17:43:16 -0500 Subject: [SciPy-User] ancova with optimize.curve_fit In-Reply-To: References: Message-ID: On Thu, Dec 2, 2010 at 4:51 PM, Peter Tittmann wrote: > here is some of the data. Josef, i'm not sure I understand your suggestion. > dbh is the dependent variable and is id (actually a list). The independent > variable is height and the categorical variable to test for covariance with > is plot. > Maybe I'm confused and am trying to do something that cant be done this > way... I don't understand what your getDiam function is supposed to do. In ancova, indHtPlot is 2d (nobs,2) with variables dbh and plot. getdiam calculates [a * dbh**b, a * plot**b] a (nobs,2) array, but it should produce instead a 1d (nobs,) array. Maybe you want to sum the functions with dbh and plot for each observation sum([a * dbh**b, a * plot**b], axis=1) This should solve the curvefit problem, but if plot is categorical, then this wouldn't be correct since it is just treated as metric variable. Maybe you want some dummy variables for plot instead. ??? Josef > Thanks for any further assistance.. > Peter > > -- > Peter Tittmann > > > On Thursday, December 2, 2010 at 1:33 PM, josef.pktd at gmail.com wrote: > > On Thu, Dec 2, 2010 at 2:55 PM, Peter Tittmann wrote: > > Greetings, > Im attempting to conduct analysis of covariance (ANCOVA) using a non-linear > regression with curve_fit in optimize. The doc string states: > "xdata : An N-length sequence or an (k,N)-shaped array for functions with k > predictors. The independent variable where the data is measured." > I am hoping that this means that if I pass the independent variable and a > categorical variable, the resulting covariance matrix will reflect the > variability in the equation coefficients among the categorical variables. > 1. Is tis the case? > 2. If so, i'm having a problem with the input array for xdata. The following > extracts data from a relational database (thats the sql). 
the eqCoeff() > function works fine, however when I add a second dimension to the xdata in > th ancova() function (indHtPlot), curve fit produces an error which seems to > be related to the structure of my input array. I've tried column_stack and > vstack to form the arrays. Any assistance would be gratefully received. > > import birdseye_db as db > import numpy as np > from scipy.optimize import curve_fit > def getDiam(ht, a, b): > ?? ?dbh = a * ht**b > ?? ?return dbh > > if ht is 2dimensional, then dbh is also two dimensional (n,k). there > should be a reduce, e.g. sum in here so that the return is 1d. > > Josef > > > def eqCoeff(): > ?? ?'''estimates coefficients a and b in dbh= a* h**b using all trees where > height was measured''' > ?? ?species=[i[0].strip(' ') for i in db.query('select distinct species from > plots')] > ?? ?res3d=db.query('select dbh, height, species from plots where ht_code=1') > ?? ?indHt=[i[1] for i in res3d] > ?? ?depDbh=[i[0] for i in res3d] > ?? ?estimated_params, err_est = curve_fit(getDiam, indHt, depDbh) > ?? ?return estimated_params, err_est > > def ancova(): > ?? ?res=db.query('select dbh, height, plot, species from plots where > ht_code=1') > ?? ?indHtPlot= np.column_stack(([i[1] for i in res],[i[2] for i in res] )) > ?? ?depDbh=[i[0] for i in res] > ?? ?estimated_params, err_est = curve_fit(getDiam, indHtPlot, depDbh) > ?? ?return estimated_params, err_est > Thanks in advance > -- > Peter Tittmann > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > From ptittmann at gmail.com Thu Dec 2 17:59:53 2010 From: ptittmann at gmail.com (Peter Tittmann) Date: Thu, 2 Dec 2010 14:59:53 -0800 Subject: [SciPy-User] ancova with optimize.curve_fit In-Reply-To: References: Message-ID: <2EDB4EB07FAA4F78B5ED0C2801524D5F@gmail.com> getDiam is a predictor to get dbh from height. It works with curve_fit to find coefficients a and b given datasetset of known dbh/height pairs. You are right, what I want is dummy variables for each plot. I'll see if I can get that worked out by revising getDiam.. Thanks again On Thursday, December 2, 2010 at 2:43 PM, josef.pktd at gmail.com wrote: > On Thu, Dec 2, 2010 at 4:51 PM, Peter Tittmann wrote: > > > here is some of the data. Josef, i'm not sure I understand your suggestion. > > dbh is the dependent variable and is id (actually a list). The independent > > variable is height and the categorical variable to test for covariance with > > is plot. > > Maybe I'm confused and am trying to do something that cant be done this > > way... > > > > > > I don't understand what your getDiam function is supposed to do. In > ancova, indHtPlot is 2d (nobs,2) with variables dbh and plot. getdiam > calculates [a * dbh**b, a * plot**b] a (nobs,2) array, but it should > produce instead a 1d (nobs,) array. > Maybe you want to sum the functions with dbh and plot for each > observation sum([a * dbh**b, a * plot**b], axis=1) > > This should solve the curvefit problem, but if plot is categorical, > then this wouldn't be correct since it is just treated as metric > variable. Maybe you want some dummy variables for plot instead. ??? 
> > Josef > > > > > Thanks for any further assistance.. > > Peter > > > > -- > > Peter Tittmann > > > > > > On Thursday, December 2, 2010 at 1:33 PM, josef.pktd at gmail.com wrote: > > > > On Thu, Dec 2, 2010 at 2:55 PM, Peter Tittmann wrote: > > > > Greetings, > > Im attempting to conduct analysis of covariance (ANCOVA) using a non-linear > > regression with curve_fit in optimize. The doc string states: > > "xdata : An N-length sequence or an (k,N)-shaped array for functions with k > > predictors. The independent variable where the data is measured." > > I am hoping that this means that if I pass the independent variable and a > > categorical variable, the resulting covariance matrix will reflect the > > variability in the equation coefficients among the categorical variables. > > 1. Is tis the case? > > 2. If so, i'm having a problem with the input array for xdata. The following > > extracts data from a relational database (thats the sql). the eqCoeff() > > function works fine, however when I add a second dimension to the xdata in > > th ancova() function (indHtPlot), curve fit produces an error which seems to > > be related to the structure of my input array. I've tried column_stack and > > vstack to form the arrays. Any assistance would be gratefully received. > > > > import birdseye_db as db > > import numpy as np > > from scipy.optimize import curve_fit > > def getDiam(ht, a, b): > > dbh = a * ht**b > > return dbh > > > > if ht is 2dimensional, then dbh is also two dimensional (n,k). there > > should be a reduce, e.g. sum in here so that the return is 1d. > > > > Josef > > > > > > def eqCoeff(): > > '''estimates coefficients a and b in dbh= a* h**b using all trees where > > height was measured''' > > species=[i[0].strip(' ') for i in db.query('select distinct species from > > plots')] > > res3d=db.query('select dbh, height, species from plots where ht_code=1') > > indHt=[i[1] for i in res3d] > > depDbh=[i[0] for i in res3d] > > estimated_params, err_est = curve_fit(getDiam, indHt, depDbh) > > return estimated_params, err_est > > > > def ancova(): > > res=db.query('select dbh, height, plot, species from plots where > > ht_code=1') > > indHtPlot= np.column_stack(([i[1] for i in res],[i[2] for i in res] )) > > depDbh=[i[0] for i in res] > > estimated_params, err_est = curve_fit(getDiam, indHtPlot, depDbh) > > return estimated_params, err_est > > Thanks in advance > > -- > > Peter Tittmann > > > > > > _______________________________________________ > > SciPy-User mailing list > > SciPy-User at scipy.org > > http://mail.scipy.org/mailman/listinfo/scipy-user > > > > > > _______________________________________________ > > SciPy-User mailing list > > SciPy-User at scipy.org > > http://mail.scipy.org/mailman/listinfo/scipy-user > > > > > > _______________________________________________ > > SciPy-User mailing list > > SciPy-User at scipy.org > > http://mail.scipy.org/mailman/listinfo/scipy-user > > > > > > @gmail.com> > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > > > -------------- next part -------------- An HTML attachment was scrubbed... 
From jsseabold at gmail.com Thu Dec 2 19:11:04 2010
From: jsseabold at gmail.com (Skipper Seabold)
Date: Thu, 2 Dec 2010 19:11:04 -0500
Subject: [SciPy-User] ancova with optimize.curve_fit
In-Reply-To: <2EDB4EB07FAA4F78B5ED0C2801524D5F@gmail.com>
References: <2EDB4EB07FAA4F78B5ED0C2801524D5F@gmail.com>
Message-ID:

On Thu, Dec 2, 2010 at 5:59 PM, Peter Tittmann wrote:
> getDiam is a predictor to get dbh from height. It works with curve_fit to
> find coefficients a and b given a dataset of known dbh/height pairs. You
> are right, what I want is dummy variables for each plot. I'll see if I can
> get that worked out by revising getDiam..
> Thanks again
>

I think it would be easier to create your dummy variables before you pass it in.

You might find some of the tools in statsmodels to be helpful here. We don't yet have an ANCOVA model, but you could definitely do something like the following. Not sure if it's exactly what you want, but it should give you an idea.

import numpy as np
import scikits.statsmodels as sm

dta = np.genfromtxt('./db_out.csv', delimiter=",", names=True, dtype=None)
plot_dummies, col_map = sm.tools.categorical(dta['plot'], drop=True, dictnames=True)

plot_dummies will be dummy variables for all of the "plot" categories, and col_map is a map from the column number to the plot just so you can be sure you know what's what.

I don't see how to use your objective function though with dummy variables. What happens if the effect of one of the plots is negative, then you run into 0 ** -1.5 == inf.

You could linearize your objective function to be

b*ln(ht)

and do something like

indHtPlot = dta['height']
depDbh = dta['dbh']
X = np.column_stack((np.log(indHtPlot), plot_dummies))
Y = np.log(depDbh)
res = sm.OLS(Y,X).fit()
res.params
array([ 0.98933264, -1.35239293, -1.0623305 , -0.99155293, -1.33675099,
       -1.30657011, -1.50933751, -1.28744779, -1.43937358, -1.33805883,
       -1.32744257, -1.42672539, -1.35239293, -1.60585046, -1.45239093,
       -1.45695112, -1.34811186, -1.32658794, -1.21721715, -1.32853084,
       -1.45775017, -1.44460388, -2.19065236, -1.3303631 , -1.20509831,
       -1.37341535, -1.25746105, -1.33954972, -1.33922709, -1.247304  ])

Note that your coefficient on height is now an elasticity. I'm sure I'm missing something here, but that might help you along the way.

Skipper

From david at silveregg.co.jp Thu Dec 2 19:54:21 2010
From: david at silveregg.co.jp (David)
Date: Fri, 03 Dec 2010 09:54:21 +0900
Subject: [SciPy-User] Accurate Frequency Measurement
In-Reply-To: <4CF7C690.2080400@allcaps.org>
References: <20101130014157.GA9408@spirit> <4CF4A0D1.4040709@silveregg.co.jp> <4CF7C690.2080400@allcaps.org>
Message-ID: <4CF83FBD.4070002@silveregg.co.jp>

On 12/03/2010 01:17 AM, Andrew Lentvorski wrote:
> On 11/29/10 10:59 PM, David wrote:
>> You may want to look at something like CLAM (http://clam-project.org) to
>> analyse those signals if you want to track frequency changes. I believe
>> they have some python bindings.
>
> That looks like a really nice project, but it looks really dead.

I don't know about that - I know it was quite alive 1-2 years ago (and the project was already a few years old). I have not followed the project much since I left academia, though.
cheers, David From jsseabold at gmail.com Thu Dec 2 19:57:38 2010 From: jsseabold at gmail.com (Skipper Seabold) Date: Thu, 2 Dec 2010 19:57:38 -0500 Subject: [SciPy-User] ancova with optimize.curve_fit In-Reply-To: References: <2EDB4EB07FAA4F78B5ED0C2801524D5F@gmail.com> Message-ID: On Thu, Dec 2, 2010 at 7:11 PM, Skipper Seabold wrote: > On Thu, Dec 2, 2010 at 5:59 PM, Peter Tittmann wrote: >> getDiam is a predictor to get dbh from height. It works with curve_fit to >> find coefficients a and b given datasetset of known dbh/height pairs. You >> are right, what I want is dummy variables for each plot. I'll see if I can >> get that worked out by revising getDiam.. >> Thanks again >> > > I think it would be easier to create your dummy variables before you pass it in. > > You might find some of the tools in statsmodels to be helpful here. > We don't yet have an ANCOVA model, but you could definitely do > something like the following. ?Not sure if it's exactly what you want, > but it should give you an idea. > > import numpy as np > import scikits.statsmodels as sm > > dta = np.genfromtxt('./db_out.csv', delimiter=",", names=True, dtype=None) > plot_dummies, col_map = sm.tools.categorical(dta['plot'], drop=True, > dictnames=True) > > plot_dummies will be dummy variables for all of the "plot" categories, > and col_map is a map from the column number to the plot just so you > can be sure you know what's what. > > I don't see how to use your objective function though with dummy > variables. ?What happens if the effect of one of the plots is > negative, then you run into 0 ** -1.5 == inf. > If you want to do NLLS and not linearize then something like this might work and still keep the dummy variables as shift parameters def getDiam(ht, *b): return ht[:,0]**b[0] + np.sum(b[1:]*ht[:,1:], axis=1) X = np.column_stack((indHtPlot, plot_dummies)) Y = depDbh coefs, cov = optimize.curve_fit(getDiam, X, Y, p0= [0.]*X.shape[1]) > You could linearize your objective function to be > > b*ln(ht) > > and do something like > > indHtPlot = dta['height'] > depDbh = dta['dbh'] > X = np.column_stack((np.log(indHtPlot), plot_dummies)) > Y = np.log(depDbh) > res = sm.OLS(Y,X).fit() > res.params > array([ 0.98933264, -1.35239293, -1.0623305 , -0.99155293, -1.33675099, > ? ? ? -1.30657011, -1.50933751, -1.28744779, -1.43937358, -1.33805883, > ? ? ? -1.32744257, -1.42672539, -1.35239293, -1.60585046, -1.45239093, > ? ? ? -1.45695112, -1.34811186, -1.32658794, -1.21721715, -1.32853084, > ? ? ? -1.45775017, -1.44460388, -2.19065236, -1.3303631 , -1.20509831, > ? ? ? -1.37341535, -1.25746105, -1.33954972, -1.33922709, -1.247304 ?]) > > Note that your coefficient on height is now an elasticity. ?I'm sure > I'm missing something here, but that might help you along the way. > > Skipper > From josef.pktd at gmail.com Thu Dec 2 20:03:53 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Thu, 2 Dec 2010 20:03:53 -0500 Subject: [SciPy-User] ancova with optimize.curve_fit In-Reply-To: References: <2EDB4EB07FAA4F78B5ED0C2801524D5F@gmail.com> Message-ID: On Thu, Dec 2, 2010 at 7:57 PM, Skipper Seabold wrote: > On Thu, Dec 2, 2010 at 7:11 PM, Skipper Seabold wrote: >> On Thu, Dec 2, 2010 at 5:59 PM, Peter Tittmann wrote: >>> getDiam is a predictor to get dbh from height. It works with curve_fit to >>> find coefficients a and b given datasetset of known dbh/height pairs. You >>> are right, what I want is dummy variables for each plot. I'll see if I can >>> get that worked out by revising getDiam.. 
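(The hazard is easy to reproduce; with NumPy floats, zero raised to a negative power silently yields inf, plus a RuntimeWarning, where a plain Python float would raise ZeroDivisionError:)

>>> import numpy as np
>>> np.float64(0.0) ** -1.5
inf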
>>> Thanks again >>> >> >> I think it would be easier to create your dummy variables before you pass it in. >> >> You might find some of the tools in statsmodels to be helpful here. >> We don't yet have an ANCOVA model, but you could definitely do >> something like the following. ?Not sure if it's exactly what you want, >> but it should give you an idea. >> >> import numpy as np >> import scikits.statsmodels as sm >> >> dta = np.genfromtxt('./db_out.csv', delimiter=",", names=True, dtype=None) >> plot_dummies, col_map = sm.tools.categorical(dta['plot'], drop=True, >> dictnames=True) >> >> plot_dummies will be dummy variables for all of the "plot" categories, >> and col_map is a map from the column number to the plot just so you >> can be sure you know what's what. >> >> I don't see how to use your objective function though with dummy >> variables. ?What happens if the effect of one of the plots is >> negative, then you run into 0 ** -1.5 == inf. >> > > If you want to do NLLS and not linearize then something like this > might work and still keep the dummy variables as shift parameters > > def getDiam(ht, *b): > ? ?return ht[:,0]**b[0] + np.sum(b[1:]*ht[:,1:], axis=1) > > X = np.column_stack((indHtPlot, plot_dummies)) > Y = depDbh > coefs, cov = optimize.curve_fit(getDiam, X, Y, p0= [0.]*X.shape[1]) In the sample file there are 11 levels of the `plot` that have only a single observation each. I tried to use onewaygls, but statsmodels.OLS doesn't work if y is a scalar. I don't know whether curvefit or optimize.leastsq will converge in this case, good starting values might be necessary. Josef > > >> You could linearize your objective function to be >> >> b*ln(ht) >> >> and do something like >> >> indHtPlot = dta['height'] >> depDbh = dta['dbh'] >> X = np.column_stack((np.log(indHtPlot), plot_dummies)) >> Y = np.log(depDbh) >> res = sm.OLS(Y,X).fit() >> res.params >> array([ 0.98933264, -1.35239293, -1.0623305 , -0.99155293, -1.33675099, >> ? ? ? -1.30657011, -1.50933751, -1.28744779, -1.43937358, -1.33805883, >> ? ? ? -1.32744257, -1.42672539, -1.35239293, -1.60585046, -1.45239093, >> ? ? ? -1.45695112, -1.34811186, -1.32658794, -1.21721715, -1.32853084, >> ? ? ? -1.45775017, -1.44460388, -2.19065236, -1.3303631 , -1.20509831, >> ? ? ? -1.37341535, -1.25746105, -1.33954972, -1.33922709, -1.247304 ?]) >> >> Note that your coefficient on height is now an elasticity. ?I'm sure >> I'm missing something here, but that might help you along the way. 
>> >> Skipper >> > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From nwagner at iam.uni-stuttgart.de Fri Dec 3 03:09:27 2010 From: nwagner at iam.uni-stuttgart.de (Nils Wagner) Date: Fri, 03 Dec 2010 09:09:27 +0100 Subject: [SciPy-User] scipy 0.9.0.dev6984 test failures Message-ID: ====================================================================== FAIL: test_cast_to_fp (test_recaster.TestRecaster) ---------------------------------------------------------------------- Traceback (most recent call last): File "/data/home/nwagner/local/lib/python2.5/site-packages/scipy/io/tests/test_recaster.py", line 73, in test_cast_to_fp 'Expected %s from %s, got %s' % (outp, inp, dtt) AssertionError: Expected from , got ====================================================================== FAIL: line-search Newton conjugate gradient optimization routine ---------------------------------------------------------------------- Traceback (most recent call last): File "/data/home/nwagner/local/lib/python2.5/site-packages/scipy/optimize/tests/test_optimize.py", line 177, in test_ncg assert_(self.gradcalls == 18, self.gradcalls) # 0.8.0 File "/data/home/nwagner/local/lib/python2.5/site-packages/numpy/testing/utils.py", line 34, in assert_ raise AssertionError(msg) AssertionError: 16 ====================================================================== FAIL: test_basic (test_signaltools.TestMedFilt) ---------------------------------------------------------------------- Traceback (most recent call last): File "/data/home/nwagner/local/lib/python2.5/site-packages/scipy/signal/tests/test_signaltools.py", line 284, in test_basic [ 0, 7, 11, 7, 4, 4, 19, 19, 24, 0]]) File "/data/home/nwagner/local/lib/python2.5/site-packages/numpy/testing/utils.py", line 686, in assert_array_equal verbose=verbose, header='Arrays are not equal') File "/data/home/nwagner/local/lib/python2.5/site-packages/numpy/testing/utils.py", line 618, in assert_array_compare raise AssertionError(msg) AssertionError: Arrays are not equal (mismatch 8.0%) x: array([[ 0., 50., 50., 50., 42., 15., 15., 18., 27., 0.], [ 0., 50., 50., 50., 50., 42., 19., 21., 29., 0.], [ 50., 50., 50., 50., 50., 47., 34., 34., 46., 35.],... y: array([[ 0, 50, 50, 50, 42, 15, 15, 18, 27, 0], [ 0, 50, 50, 50, 50, 42, 19, 21, 29, 0], [50, 50, 50, 50, 50, 47, 34, 34, 46, 35],... ---------------------------------------------------------------------- Ran 4794 tests in 75.009s FAILED (KNOWNFAIL=13, SKIP=17, failures=4) From charlesr.harris at gmail.com Fri Dec 3 12:05:03 2010 From: charlesr.harris at gmail.com (Charles R Harris) Date: Fri, 3 Dec 2010 10:05:03 -0700 Subject: [SciPy-User] scipy 0.9.0.dev6984 test failures In-Reply-To: References: Message-ID: On Fri, Dec 3, 2010 at 1:09 AM, Nils Wagner wrote: > > ====================================================================== > FAIL: test_cast_to_fp (test_recaster.TestRecaster) > ---------------------------------------------------------------------- > Traceback (most recent call last): > File > > "/data/home/nwagner/local/lib/python2.5/site-packages/scipy/io/tests/test_recaster.py", > line 73, in test_cast_to_fp > 'Expected %s from %s, got %s' % (outp, inp, dtt) > AssertionError: Expected from 'numpy.float64'>, got > > Recaster is gone from the repository but I found a copy in the build directory. Try deleting the build and installation directories. 
Chuck

From josef.pktd at gmail.com Sat Dec 4 09:27:08 2010
From: josef.pktd at gmail.com (josef.pktd at gmail.com)
Date: Sat, 4 Dec 2010 09:27:08 -0500
Subject: [SciPy-User] stats.distributions moments and expect - another round
Message-ID:

I spent most of a day fixing the expect function for stats distributions and checking mean, variance, skew and kurtosis, and trying to get it to work for almost all distributions.

expect, which uses integrate.quad, doesn't always work well, but the results look mostly good except for some fat-tailed distributions.

Below are the comparisons between the stats method of the distributions and the outcome of expect. The first failure for mean is at 4 decimals. The tables are discrepancies at two decimals:

invgamma might still have some numerical integration problems that I haven't figured out yet. Some other ones still look like incorrect formulas for skew and kurtosis.

As an aside: I think, finally, that we can add the doc templates to many distributions, so we have a way of pointing out differences in parameterization and known numerical problems. For example, I'm still not sure whether dist.stats contains the correct information about whether the moments don't exist or are infinite.

Josef

Variance

>>> print SimpleTable(var_, headers=['distname', 'diststats', 'expect'])
==========================================
 distname      diststats        expect
------------------------------------------
 pareto       1.60340572053    1.57246536012
 tukeylambda  0.304764722791   0.0268724542194
 fatiguelife  884942.25        884940.75504
 t            3.69051720817    3.66604272097
 powerlaw     0.858574961358   0.0641248702779
 invgamma     13.1319516251    5.5288716506
 rdist        1.78002848501    0.526315790145
------------------------------------------

Skew

>>> print SimpleTable(skew, headers=['distname', 'diststats', 'expect'])
===========================================
 distname      diststats         expect
-------------------------------------------
 mielke       7.59540085257     6.91088104518
 fisk         38.7938857832     12.9246301568
 foldnorm     0.971407236222    0.202188769695
 gilbrat      6.18487713863     6.17333365849
 loglaplace   16.9237038681     11.2535633383
 fatiguelife  0.0408718910942   3.93239048725
 powerlaw     -0.906960466124   -0.420181302329
 ncf          40747519832.7     8.94481163856
 f            1.93130205529     1.80641432186
 invgamma     -0.477729689377   655.942282864
-------------------------------------------

Kurtosis

>>> print SimpleTable(kurt, headers=['distname', 'diststats', 'expect'])
===========================================
 distname      diststats         expect
-------------------------------------------
 mielke       -149.405089743    362.442518204
 fisk         -224.659270348    2099.30241789
 foldnorm     2.70517285483     -0.294828589688
 tukeylambda  -2.98365209914    -0.897302898918
 dweibull     1.90893020344     -1.06484211833
 gilbrat      110.936392176     107.859563214
 loglaplace   -164.332555303    1330.2395564
 genpareto    14.8285714286     14.8119563403
 lognorm      81.1353811489     79.6180870122
 burr         112616.270172     6.21265707889
 ncf          -239984516633.0   13492.9902378
 f            7.9138697318      7.06539159862
 nct          -409040.407062    0.605963897342
 invgamma     -2.866573514      2116889.58176
 rdist        -2.56785479799    -1.53846154515
-------------------------------------------

From josef.pktd at gmail.com Sat Dec 4 10:35:55 2010
From: josef.pktd at gmail.com (josef.pktd at gmail.com)
Date: Sat, 4 Dec 2010 10:35:55 -0500
Subject: [SciPy-User] stats.distributions moments and expect - another round
In-Reply-To:
References:
Message-ID:

On Sat, Dec 4, 2010 at 9:27 AM, wrote:
> I spent most of a day fixing the expect function for
> distributions and checking mean, variance, skew and kurtosis,
> trying to get it to work for almost all distributions.
>
> [Variance, Skew and Kurtosis comparison tables snipped -- see the
> original message above]

one down: invgamma is correct if a > 4, the requirement for its kurtosis to exist

Josef

From josef.pktd at gmail.com Sat Dec 4 11:29:15 2010
From: josef.pktd at gmail.com (josef.pktd at gmail.com)
Date: Sat, 4 Dec 2010 11:29:15 -0500
Subject: [SciPy-User] stats.distributions moments and expect - another round
In-Reply-To: References: Message-ID: 

On Sat, Dec 4, 2010 at 10:35 AM, wrote:
> On Sat, Dec 4, 2010 at 9:27 AM, wrote:
>> [comparison tables snipped]
>
> one down: invgamma is correct if a>4, requirement for kurtosis to exist

I'm giving up; staring at the kurtosis of nct and burr without the direct references looks like more fun than I have time for. I think I have a patch for ncf. Any volunteers, or references?

At least we are down to a reasonably short list that needs checking or bugfixing.

Josef

From ralf.gommers at googlemail.com Sun Dec 5 10:15:47 2010
From: ralf.gommers at googlemail.com (Ralf Gommers)
Date: Sun, 5 Dec 2010 23:15:47 +0800
Subject: [SciPy-User] scipy 0.9.0.dev6984 test failures
In-Reply-To: References: Message-ID: 

On Fri, Dec 3, 2010 at 4:09 PM, Nils Wagner wrote:
>
> FAIL: line-search Newton conjugate gradient optimization routine
> [traceback snipped -- see Nils' original message]
>     assert_(self.gradcalls == 18, self.gradcalls)  # 0.8.0
> AssertionError: 16

This number of calls has been changing before apparently, and now has differences between platforms or python versions. For 0.8.0 it had an issue on Windows due to == comparison with floating point numbers.

Since converging faster is not exactly a bug, can we just change the comparison to <= ?

> FAIL: test_basic (test_signaltools.TestMedFilt)
> [traceback snipped]
> AssertionError: Arrays are not equal (mismatch 8.0%)

If you change the assert_array_equal calls in TestMedfilt to assert_array_almost_equal does the test pass?

Ralf
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 
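As a quick illustration of the kind of check discussed in this thread -- an editor's sketch, assuming the expect method as it was being added to the scipy 0.9 dev branch; invgamma and the shape value are just an example:

import numpy as np
from scipy import stats

a = 5.0  # invgamma shape; the kurtosis only exists for a > 4, as noted above
m = stats.invgamma.expect(lambda x: x, args=(a,))           # mean by integration
v = stats.invgamma.expect(lambda x: (x - m)**2, args=(a,))  # variance by integration
print m, v
print stats.invgamma.stats(a, moments='mv')  # compare with the closed-form stats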
From nwagner at iam.uni-stuttgart.de Sun Dec 5 12:13:21 2010
From: nwagner at iam.uni-stuttgart.de (Nils Wagner)
Date: Sun, 05 Dec 2010 18:13:21 +0100
Subject: [SciPy-User] scipy 0.9.0.dev6984 test failures
In-Reply-To: References: Message-ID: 

On Sun, 5 Dec 2010 23:15:47 +0800 Ralf Gommers wrote:
> [quoted tracebacks snipped -- see the messages above]
> If you change the assert_array_equal calls in TestMedfilt to
> assert_array_almost_equal does the test pass?

Hi Ralf,

Unfortunately, the test didn't pass.

======================================================================
FAIL: test_basic (test_signaltools.TestMedFilt)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/nwagner/local/lib64/python2.6/site-packages/scipy/signal/tests/test_signaltools.py", line 284, in test_basic
    [ 0, 7, 11, 7, 4, 4, 19, 19, 24, 0]])
  File "/home/nwagner/local/lib64/python2.6/site-packages/numpy/testing/utils.py", line 774, in assert_array_almost_equal
    header='Arrays are not almost equal')
  File "/home/nwagner/local/lib64/python2.6/site-packages/numpy/testing/utils.py", line 618, in assert_array_compare
    raise AssertionError(msg)
AssertionError:
Arrays are not almost equal

(mismatch 8.0%)
 x: array([[  0.,  50.,  50.,  50.,  42.,  15.,  15.,  18.,  27.,   0.],
       [  0.,  50.,  50.,  50.,  50.,  42.,  19.,  21.,  29.,   0.],
       [ 50.,  50.,  50.,  50.,  50.,  47.,  34.,  34.,  46.,  35.],...
 y: array([[ 0, 50, 50, 50, 42, 15, 15, 18, 27,  0],
       [ 0, 50, 50, 50, 50, 42, 19, 21, 29,  0],
       [50, 50, 50, 50, 50, 47, 34, 34, 46, 35],...

----------------------------------------------------------------------
Ran 312 tests in 3.010s

FAILED (failures=1)

Nils

From ralf.gommers at googlemail.com Sun Dec 5 19:13:08 2010
From: ralf.gommers at googlemail.com (Ralf Gommers)
Date: Mon, 6 Dec 2010 08:13:08 +0800
Subject: [SciPy-User] scipy 0.9.0.dev6984 test failures
In-Reply-To: References: Message-ID: 

On Mon, Dec 6, 2010 at 1:13 AM, Nils Wagner wrote:
> [failure output snipped -- see above]
> Unfortunately, the test didn't pass.

Then can you investigate a bit? Are there nans/infs in one of the outputs? The parts of the arrays that are printed look exactly the same.

Ralf
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 
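A minimal sketch of the probe Ralf is asking for -- illustrative only, not the actual unit-test data; it feeds the same random integers through medfilt as float and as int and reports nans/infs and the positions where the two outputs differ:

import numpy as np
from scipy import signal

x = np.random.randint(0, 51, (10, 10))
d = signal.medfilt(x.astype(np.float64))  # float input, like the test's 'x' array
e = signal.medfilt(x)                     # int input, like the test's 'y' array
print np.isnan(d).any(), np.isinf(d).any()
print np.argwhere(d != e)                 # indices of any mismatches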
From cimrman3 at ntc.zcu.cz Mon Dec 6 06:20:50 2010
From: cimrman3 at ntc.zcu.cz (Robert Cimrman)
Date: Mon, 06 Dec 2010 12:20:50 +0100
Subject: [SciPy-User] ANN: SfePy 2010.4
Message-ID: <4CFCC712.9050508@ntc.zcu.cz>

I am pleased to announce release 2010.4 of SfePy.

Description
-----------

SfePy (simple finite elements in Python) is software for solving systems of coupled partial differential equations by the finite element method. The code is based on the NumPy and SciPy packages. It is distributed under the new BSD license.

Home page: http://sfepy.org
Mailing lists, issue tracking: http://code.google.com/p/sfepy/
Git (source) repository: http://github.com/sfepy
Documentation: http://docs.sfepy.org/doc

Highlights of this release
--------------------------

- higher order elements
- refactoring of geometries (reference mappings)
- transparent DOF vector synchronization with variables
- interface variables defined on a surface region

For more information on this release, see http://sfepy.googlecode.com/svn/web/releases/2010.4_RELEASE_NOTES.txt (full release notes, rather long and technical).

Best regards,
Robert Cimrman and Contributors (*)

(*) Contributors to this release (alphabetical order): Vladimír Lukeš, Logan Sorenson, Olivier Verdier

From almar.klein at gmail.com Mon Dec 6 06:45:51 2010
From: almar.klein at gmail.com (Almar Klein)
Date: Mon, 6 Dec 2010 12:45:51 +0100
Subject: [SciPy-User] ANN: IEP 2.3 (the Interactive Editor for Python)
In-Reply-To: References: Message-ID: 

On 27 November 2010 20:02, Almar Klein wrote:
> On 26 November 2010 20:31, Christian K. wrote:
>> Hi Almar,
>>
>> Am 26.11.10 16:47, schrieb Almar Klein:
>> > Hi all,
>> >
>> > I am pleased to announce version 2.3 of IEP, the interactive Editor for
>> > Python.
>> >
>> > IEP is a cross-platform Python IDE focused on interactivity and
>> > introspection, which makes it very suitable for scientific computing.
>> > Its practical design is aimed at simplicity and efficiency.
>> >
>> > website: http://code.google.com/p/iep/
>> > downloads: http://code.google.com/p/iep/downloads/list
>> > (binaries are available for Windows, Linux and Mac)
>>
>> the mac binary does not work here. It looks for a python 3.1
>> installation in some special place which I do not have:
>>
>> Dyld Error Message:
>> Library not loaded: /opt/local/Library/Frameworks/Python.framework/Versions/3.1/Python
>> Referenced from: /Applications/iep.app/Contents/MacOS/iep
>> Reason: image not found
>
> A bug report has been filed: http://code.google.com/p/iep/issues/detail?id=18

A working binary is now available (there are remaining problems with Mac OS 10.5). On a related topic: the 32bit Linux binaries are now available with anti-aliased fonts. I'm now working on the 64bit Linux binaries (takes a day or two of recompiling stuff).

Almar
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From dandavison7 at gmail.com Mon Dec 6 07:14:36 2010
From: dandavison7 at gmail.com (Dan Davison)
Date: Mon, 06 Dec 2010 12:14:36 +0000
Subject: [SciPy-User] Installing on OSX 10.6 Snow Leopard
Message-ID: <87d3pfhzdf.fsf@gmail.com>

Hi,

I'm failing to install scipy on Mac OSX 10.6. I would be happy to use the binary .dmg installer from sourceforge, but it detects the system python that Apple ships and refuses to install. I do have python26 installed -- how do I use the .dmg installer?
In many places on the web it is said that the scipy installers "work with the python from python.org" rather than Apple's python, but I haven't seen any instruction as to how one accomplishes that.

Thanks very much,

Dan

From ralf.gommers at googlemail.com Mon Dec 6 07:58:40 2010
From: ralf.gommers at googlemail.com (Ralf Gommers)
Date: Mon, 6 Dec 2010 20:58:40 +0800
Subject: [SciPy-User] Installing on OSX 10.6 Snow Leopard
In-Reply-To: <87d3pfhzdf.fsf@gmail.com>
References: <87d3pfhzdf.fsf@gmail.com>
Message-ID: 

On Mon, Dec 6, 2010 at 8:14 PM, Dan Davison wrote:
> I'm failing to install scipy on Mac OSX 10.6. [...] I do have python26
> installed -- how do I use the .dmg installer?

Download the dmg from http://www.python.org/ftp/python/2.6.6/

Cheers,
Ralf
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From lionel.roubeyrie at gmail.com Mon Dec 6 09:22:22 2010
From: lionel.roubeyrie at gmail.com (Lionel Roubeyrie)
Date: Mon, 6 Dec 2010 15:22:22 +0100
Subject: [SciPy-User] kriging module
In-Reply-To: <9eab2628-df85-4248-b0fa-0ba1a52b86af@p30g2000prb.googlegroups.com>
References: <4CEA2731.63BA.009B.1@twdb.state.tx.us> <4CEB7F2F.63BA.009B.1@twdb.state.tx.us> <9eab2628-df85-4248-b0fa-0ba1a52b86af@p30g2000prb.googlegroups.com>
Message-ID: 

Hi all,
I'm not really familiar with Git; I just saw that I had given a bad address. So, if anyone wants to test, here is the good one: git at github.com:LionelR/krige.git
I'd really appreciate any comments, thanks.

2010/11/24 Anand Patil :
> Hi everyone,
>
> I'm the author of PyMC's GP module. Sorry to come late to this thread.
> The discussion of my module has been on target, and thanks very much
> for the kind words... as everyone here knows it's nice when people
> notice code that you've worked hard on. I have a couple of hopefully
> relevant things to say about it.
>
> First, the GP module is broader in scope than what people typically
> mean by GP regression and kriging. The statistical model underlying
> typical GPR/K says that the data are normally distributed with
> expectations equal to the GP's value at particular, known locations.
> Further, the mean and covariance parameters of the field, as well as
> the variance of the data, are typically fixed before starting the
> regression.
>
> With the GP module, the mean and covariance parameters can be unknown,
> and the data can depend on the field in any way; as a random example,
> each data point could be Gamma distributed, with parameters determined
> by a nonlinear transformation of the field's value at several unknown
> locations.
>
> That said, the module has a very pronounced fast path that restricts
> its practical model space to Bayesian geostatistics, which means the
> aforementioned locations have to be known before starting the
> regression. This is still a superset of GPR/K. There are numerous
> examples of the GP module in use for Bayesian geostatistics at
> github.com/malaria-atlas-project.
>
> Second, the parts of the GP module that would help with GPR/K are not
> very tightly bound to either the rest of PyMC or the Bayesian
> paradigm, and could be pulled out. These parts are the Mean,
> Covariance and Realization objects, functions like observe and
> point_predict, and their components; but not the GP submodels and step
> methods mentioned in the user guide.
>
> Any questions on the GP module are welcome at groups.google.com/p/pymc.
> I'm looking forward to checking out the work in progress on the scikit.
>
> Cheers,
> Anand
>
> On Nov 23, 2:45 pm, "Dharhas Pothina" wrote:
>> We were planning to project our irregular data onto a cartesian grid and
>> try to use matplotlib to visualize the variograms. I don't think I know
>> enough about the math of kriging to be of much help in the coding, but I
>> might be able to give your module a try if I can find time between deadlines.
>>
>> - dharhas
>>
>> >>> Lionel Roubeyrie 11/22/2010 9:15 AM >>>
>> I have tried hpgl and had some discussions with one of the main
>> developers, but hpgl works only on a cartesian (regular) grid, whereas I
>> want to have the possibility of predictions on irregular points and the
>> possibility to visualize variograms.
>>
>> 2010/11/22 Dharhas Pothina :
>> > What about this package? http://hpgl.sourceforge.net/
>> > I was looking for a kriging module recently and came across this. I
>> > haven't tried it out yet but am getting ready to. It uses numpy arrays
>> > and also is able to read/write GSLib files. GSLib seems to be a fairly
>> > established command line library in the Geostats world.
>> > - dharhas
>> > On Sat, Nov 20, 2010 at 12:56 PM, Lionel Roubeyrie <
>> > lionel.roubey... at gmail.com> wrote:
>> >> Hi all,
>> >> I have written a simple module for kriging computation (ordinary
>> >> kriging for the moment). It's not optimized and maybe some minor
>> >> errors are inside, but I think it delivers correct results. Are there
>> >> some people here who can help me optimize the code or just give it a
>> >> try? I don't know the policy of this mailing list on attached files,
>> >> so I don't send it here for now.
>> >> Thanks

--
Lionel Roubeyrie
lionel.roubeyrie at gmail.com
http://youarealegend.blogspot.com
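For readers wondering what an ordinary kriging module computes, here is a small self-contained sketch of the standard equations in plain NumPy -- an editor's illustration, not code from Lionel's module; the exponential variogram and its parameters are arbitrary choices:

import numpy as np

def ordinary_krige(xy, z, xy0, sill=1.0, rng=1.0):
    # exponential semivariogram, no nugget
    gamma = lambda h: sill * (1.0 - np.exp(-h / rng))
    n = len(z)
    # pairwise distances between known points, and to the prediction point
    d = np.sqrt(((xy[:, None, :] - xy[None, :, :]) ** 2).sum(-1))
    d0 = np.sqrt(((xy - xy0) ** 2).sum(-1))
    # ordinary kriging system: [gamma(d) 1; 1' 0] [w; mu] = [gamma(d0); 1]
    A = np.ones((n + 1, n + 1))
    A[:n, :n] = gamma(d)
    A[n, n] = 0.0
    b = np.append(gamma(d0), 1.0)
    w = np.linalg.solve(A, b)
    zhat = np.dot(w[:n], z)            # prediction at xy0
    var = np.dot(b[:n], w[:n]) + w[n]  # kriging variance
    return zhat, var

xy = np.random.rand(20, 2)
z = np.sin(3 * xy[:, 0]) + xy[:, 1]
print ordinary_krige(xy, z, np.array([0.5, 0.5]))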
From carovi at utu.fi Mon Dec 6 13:47:12 2010
From: carovi at utu.fi (Carolin Villforth)
Date: Mon, 06 Dec 2010 13:47:12 -0500
Subject: [SciPy-User] changing covariance factor in scipy.stats.kde.gaussian_kde
Message-ID: <6C70B2C8-A065-4EE5-8C55-E50F80F1D5B2@utu.fi>

Hello,

I have a question concerning the usage of gaussian_kde. I am trying to change the covariance factor for the KDE; this is how I am doing it at the moment:

myKDE = scipy.stats.kde.gaussian_kde(data)
myKDE.covariance_factor = myKDE.silverman_factor
myKDE._compute_covariance()

The last line seems to be necessary for the changes to take effect.

While the above code works, this might create quite an overhead, since _compute_covariance is executed twice: once in __init__ and then again after the covariance factor has been changed. If I understood correctly, this is not really necessary, since silverman_factor does not depend on outputs from _compute_covariance. Also, I always assumed that one should avoid calling '_functions' from outside the class.

Is there another way to change the covariance factor?

Thanks

Greetings

Carolin

----------------------------------------------------------
Carolin Villforth
PhD Student
Tuorla Observatory Finland and
Space Telescope Science Institute
3700 San Martin Drive
21218 Baltimore, MD
USA
phone: +1-410-338-4334
email: carovi at utu.fi, villfort at stsci.edu
----------------------------------------------------------

From robert.kern at gmail.com Mon Dec 6 13:48:39 2010
From: robert.kern at gmail.com (Robert Kern)
Date: Mon, 6 Dec 2010 12:48:39 -0600
Subject: [SciPy-User] changing covariance factor in scipy.stats.kde.gaussian_kde
In-Reply-To: <6C70B2C8-A065-4EE5-8C55-E50F80F1D5B2@utu.fi>
References: <6C70B2C8-A065-4EE5-8C55-E50F80F1D5B2@utu.fi>
Message-ID: 

On Mon, Dec 6, 2010 at 12:47, Carolin Villforth wrote:
> [...]
> Is there another way to change the covariance factor?

Subclass.

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco

From nwagner at iam.uni-stuttgart.de Mon Dec 6 13:55:42 2010
From: nwagner at iam.uni-stuttgart.de (Nils Wagner)
Date: Mon, 06 Dec 2010 19:55:42 +0100
Subject: [SciPy-User] scipy 0.9.0.dev6984 test failures
In-Reply-To: References: Message-ID: 

>> Then can you investigate a bit? Are there nans/infs in one of the outputs?
> The parts of the arrays that are printed look exactly the same.

O.k., the arrays d and e differ in a few places -- most visibly 88 versus 11:

[[  0.  50.  50.  50.  42.  15.  15.  18.  27.   0.]
 [  0.  50.  50.  50.  50.  42.  19.  21.  29.   0.]
 [ 50.  50.  50.  50.  50.  47.  34.  34.  46.  35.]
 [ 50.  50.  50.  50.  50.  50.  46.  47.  65.  42.]
 [ 50.  50.  50.  50.  50.  50.  46.  55.  66.  35.]
 [ 33.  50.  50.  50.  50.  47.  46.  47.  58.  26.]
 [ 32.  50.  50.  50.  50.  50.  46.  45.  58.  26.]
 [  7.  46.  50.  50.  47.  46.  46.  43.  45.  21.]
 [  0.  32.  33.  39.  32.  32.  43.  43.  43.   0.]
 [  0.   7.  88.   7.   4.   4.  19.  19.  24.   0.]]

[[  0.  50.  50.  50.  42.  15.  15.  18.  27.   0.]
 [  0.  50.  50.  50.  50.  42.  19.  21.  29.   0.]
 [ 50.  50.  50.  50.  50.  47.  34.  34.  46.  35.]
 [ 50.  50.  50.  50.  50.  50.  42.  47.  64.  42.]
 [ 50.  50.  50.  50.  50.  50.  46.  55.  64.  35.]
 [ 33.  50.  50.  50.  50.  47.  46.  43.  55.  26.]
 [ 32.  50.  50.  50.  50.  47.  46.  45.  55.  26.]
 [  7.  46.  50.  50.  47.  46.  46.  43.  45.  21.]
 [  0.  32.  33.  39.  32.  32.  43.  43.  43.   0.]
 [  0.   7.  11.   7.   4.   4.  19.  19.  24.   0.]]

Nils

From carovi at utu.fi Mon Dec 6 14:07:52 2010
From: carovi at utu.fi (Carolin Villforth)
Date: Mon, 06 Dec 2010 14:07:52 -0500
Subject: [SciPy-User] changing covariance factor in scipy.stats.kde.gaussian_kde
In-Reply-To: References: <6C70B2C8-A065-4EE5-8C55-E50F80F1D5B2@utu.fi>
Message-ID: <01EA5DE5-29B5-4DCD-B9FE-863C35AB33E1@utu.fi>

Do you mean I should inherit gaussian_kde into my own class and then override covariance_factor in the inherited class?

Thanks

On Dec 6, 2010, at 1:48 PM, Robert Kern wrote:
> [...]
> Subclass.

From josef.pktd at gmail.com Mon Dec 6 14:26:30 2010
From: josef.pktd at gmail.com (josef.pktd at gmail.com)
Date: Mon, 6 Dec 2010 14:26:30 -0500
Subject: [SciPy-User] changing covariance factor in scipy.stats.kde.gaussian_kde
In-Reply-To: <01EA5DE5-29B5-4DCD-B9FE-863C35AB33E1@utu.fi>
References: <6C70B2C8-A065-4EE5-8C55-E50F80F1D5B2@utu.fi> <01EA5DE5-29B5-4DCD-B9FE-863C35AB33E1@utu.fi>
Message-ID: 

On Mon, Dec 6, 2010 at 2:07 PM, Carolin Villforth wrote:
> Do you mean I should inherit gaussian_kde into my own class and then
> override covariance_factor in the inherited class?

yes

http://stackoverflow.com/questions/2678425/fitting-gaussian-kde-in-numpy-scipy-in-python
http://mail.scipy.org/pipermail/scipy-user/2010-January/023877.html

I never checked if there are redundant calculations.

Josef

> [rest of the quoted exchange snipped]
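A minimal sketch of the subclass Robert and Josef suggest -- assuming the gaussian_kde internals of this scipy generation, where covariance_factor defaults to scotts_factor and is called once from __init__ via _compute_covariance:

import numpy as np
from scipy import stats

class SilvermanKDE(stats.kde.gaussian_kde):
    # overriding covariance_factor means __init__ picks up the Silverman
    # bandwidth directly, so _compute_covariance runs only once and no
    # private method needs to be called from outside the class
    def covariance_factor(self):
        return self.silverman_factor()

data = np.random.randn(100)  # stand-in for Carolin's data
myKDE = SilvermanKDE(data)
print myKDE.covariance_factor()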
From jsseabold at gmail.com Mon Dec 6 17:34:19 2010
From: jsseabold at gmail.com (Skipper Seabold)
Date: Mon, 6 Dec 2010 17:34:19 -0500
Subject: [SciPy-User] fast small matrix multiplication with cython?
Message-ID: 

I'm wondering if anyone might have a look at my cython code that does matrix multiplication and see where I can speed it up or offer some pointers/reading. I'm new to Cython and my knowledge of C is pretty basic, based on trial and (mostly) error, so I am sure the code is still very naive.

import numpy as np
from matmult import dotAB, multAB

A = np.array([[ 1.,  3.,  4.],
              [ 5.,  6.,  3.]])
B = A.T.copy()

timeit dotAB(A,B)
# 1 loops, best of 3: 826 ms per loop

timeit multAB(A,B)
# 1 loops, best of 3: 1.16 s per loop

As you can see, my multAB achieves a "speedup" factor of only about 0.7 relative to dotAB -- i.e., it is actually slower.

I compile the cython code with

cython -a matmult.pyx
gcc -shared -pthread -fPIC -fwrapv -O2 -Wall -fno-strict-aliasing -I/usr/include/python2.6 -I/usr/local/lib/python2.6/dist-packages/numpy/core/include/ -o matmult.so matmult.c

Cython code is attached and inlined below.
Profile is here (some of which I don't understand why there are bottlenecks): http://eagle1.american.edu/~js2796a/matmult/matmult.html

-----------------------------------------------------------

from numpy cimport float64_t, ndarray, NPY_DOUBLE, npy_intp
cimport cython
from numpy import dot

ctypedef float64_t DOUBLE

cdef extern from "numpy/arrayobject.h":
    cdef void import_array()
    cdef object PyArray_SimpleNew(int nd, npy_intp *dims, int typenum)

import_array()

@cython.boundscheck(False)
@cython.wraparound(False)
cdef inline object matmult(ndarray[DOUBLE, ndim=2, mode='c'] A,
                           ndarray[DOUBLE, ndim=2, mode='c'] B):
    cdef int lda = A.shape[0]
    cdef int n = B.shape[1]
    cdef npy_intp *dims = [lda, n]
    cdef ndarray[DOUBLE, ndim=2] out = PyArray_SimpleNew(2, dims, NPY_DOUBLE)
    cdef int i,j,k
    cdef double s
    for i in xrange(lda):
        for j in xrange(n):
            s = 0
            for k in xrange(A.shape[1]):
                s += A[i,k] * B[k,j]
            out[i,j] = s
    return out

def multAB(ndarray[DOUBLE, ndim=2] A, ndarray[DOUBLE, ndim=2] B):
    for i in xrange(1000000):
        C = matmult(A,B)
    return C

def dotAB(ndarray[DOUBLE, ndim=2] A, ndarray[DOUBLE, ndim=2] B):
    for i in xrange(1000000):
        C = dot(A,B)
    return C

Skipper
-------------- next part --------------
A non-text attachment was scrubbed...
Name: matmult.pyx
Type: application/octet-stream
Size: 1249 bytes
Desc: not available
URL: 

From pav at iki.fi Mon Dec 6 19:11:12 2010
From: pav at iki.fi (Pauli Virtanen)
Date: Tue, 7 Dec 2010 00:11:12 +0000 (UTC)
Subject: [SciPy-User] fast small matrix multiplication with cython?
References: Message-ID: 

On Mon, 06 Dec 2010 17:34:19 -0500, Skipper Seabold wrote:
> I'm wondering if anyone might have a look at my cython code that does
> matrix multiplication and see where I can speed it up or offer some
> pointers/reading. [...]

You'll be hard pressed to do better than Numpy's dot. In the raw data handling, BLAS is very likely faster than most things you can code manually. Moreover, the Cython routine you write must have as much overhead as dot() --- dealing with refcounting, allocating/deallocating PyArrayObjects (which is expensive) etc.

If you are willing to give up wrapping each small matrix in a separate Numpy ndarray, then you can expect to get additional speed gains. (Although even in that case it could make more sense to call BLAS routines to do the multiplication instead, unless your matrices are small and of fixed size, in which case the C compiler may be able to produce some tightly optimized code.)

However, in many cases the small matrices can be just stuffed into a single Numpy array. At the moment there is no "vectorized" matrix multiplication routine, however, so that could be written e.g. in Cython.

-- 
Pauli Virtanen

From robert.kern at gmail.com Mon Dec 6 19:23:12 2010
From: robert.kern at gmail.com (Robert Kern)
Date: Mon, 6 Dec 2010 18:23:12 -0600
Subject: [SciPy-User] fast small matrix multiplication with cython?
In-Reply-To: References: Message-ID: 

On Mon, Dec 6, 2010 at 18:11, Pauli Virtanen wrote:
> [...]
> You'll be hard pressed to do better than Numpy's dot.

The main thing for his use case is reducing the overhead when called from Cython. This started in a Cython-users thread where he was directly calling the Python numpy.dot() from Cython. I suggested that, given the small dimensions (only up to 10x10), writing the matmult directly in Cython might be better. Unfortunately, the buffer syntax adds a bunch of overhead. Not the *same* overhead, mind, and I was hoping it would be less, but it turns out to be more.

Getting access to the C BLAS implementations would be best. I guess you could get descr.f.dotfunc and use that.

> If you are willing to give up wrapping each small matrix in a separate
> Numpy ndarray, then you can expect to get additional speed gains. [...]
> However, in many cases the small matrices can be just stuffed into a
> single Numpy array.

His use case (Kalman filters) prevents this.

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco

From jsseabold at gmail.com Mon Dec 6 19:30:29 2010
From: jsseabold at gmail.com (Skipper Seabold)
Date: Mon, 6 Dec 2010 19:30:29 -0500
Subject: [SciPy-User] fast small matrix multiplication with cython?
In-Reply-To: References: Message-ID: 

On Mon, Dec 6, 2010 at 7:11 PM, Pauli Virtanen wrote:
> [...]
> However, in many cases the small matrices can be just stuffed into a
> single Numpy array. At the moment there is no "vectorized" matrix
> multiplication routine, however, so that could be written e.g. in Cython.

Ah, I see. I didn't think about the overhead of PyArrayObject.

Skipper
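Pauli's "stuff the small matrices into a single Numpy array" suggestion can be sketched without Cython at all; an editor's illustration using broadcasting only, which works on the NumPy of this era (np.einsum in later releases does the same thing in one call):

import numpy as np

m = 1000                      # a batch of m small matrix products
A = np.random.rand(m, 2, 3)
B = np.random.rand(m, 3, 2)
# C[i] = np.dot(A[i], B[i]) for all i at once: broadcast the elementwise
# products, then sum over the shared axis k
C = (A[:, :, :, None] * B[:, None, :, :]).sum(axis=2)
print C.shape                 # (m, 2, 2)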
From ptittmann at gmail.com Mon Dec 6 19:31:13 2010
From: ptittmann at gmail.com (Peter Tittmann)
Date: Mon, 6 Dec 2010 16:31:13 -0800
Subject: [SciPy-User] ancova with optimize.curve_fit
In-Reply-To: <2EDB4EB07FAA4F78B5ED0C2801524D5F@gmail.com>
Message-ID: <2061813969B94BD39B4BFFDABFFC2D00@gmail.com>

thanks both of you,

Josef, the data that I sent is only the first 100 rows of about 1500; there should be sufficient sampling in each plot.

Skipper, I have attempted to deploy your suggestion for not linearizing the data. It seems to work. I'm a little confused by your modification of the getDiam function and I wonder if you could help me understand. The form of the equation being fit is:

Y = a*X^b

your version of the getDiam function:

def getDiam(ht, *b):
    return ht[:,0]**b[0] + np.sum(b[1:]*ht[:,1:], axis=1)

I'm sorry if this is an obvious question, but I don't understand how this works, as it seems that the "a" coefficient is missing.

Thanks again!

-- 
Peter Tittmann

On Thursday, December 2, 2010 at 5:03 PM, josef.pktd at gmail.com wrote:
> On Thu, Dec 2, 2010 at 7:57 PM, Skipper Seabold wrote:
> > On Thu, Dec 2, 2010 at 7:11 PM, Skipper Seabold wrote:
> > > On Thu, Dec 2, 2010 at 5:59 PM, Peter Tittmann wrote:
> > >> getDiam is a predictor to get dbh from height. It works with curve_fit to
> > >> find coefficients a and b given a dataset of known dbh/height pairs. You
> > >> are right, what I want is dummy variables for each plot. I'll see if I can
> > >> get that worked out by revising getDiam.
> > >> Thanks again
> > >
> > > I think it would be easier to create your dummy variables before you pass it in.
> > >
> > > You might find some of the tools in statsmodels to be helpful here.
> > > We don't yet have an ANCOVA model, but you could definitely do
> > > something like the following. Not sure if it's exactly what you want,
> > > but it should give you an idea.
> > >
> > > import numpy as np
> > > import scikits.statsmodels as sm
> > >
> > > dta = np.genfromtxt('./db_out.csv', delimiter=",", names=True, dtype=None)
> > > plot_dummies, col_map = sm.tools.categorical(dta['plot'], drop=True,
> > > dictnames=True)
> > >
> > > plot_dummies will be dummy variables for all of the "plot" categories,
> > > and col_map is a map from the column number to the plot just so you
> > > can be sure you know what's what.
> > >
> > > I don't see how to use your objective function though with dummy
> > > variables. What happens if the effect of one of the plots is
> > > negative, then you run into 0 ** -1.5 == inf.
> >
> > If you want to do NLLS and not linearize then something like this
> > might work and still keep the dummy variables as shift parameters
> >
> > def getDiam(ht, *b):
> >     return ht[:,0]**b[0] + np.sum(b[1:]*ht[:,1:], axis=1)
> >
> > X = np.column_stack((indHtPlot, plot_dummies))
> > Y = depDbh
> > coefs, cov = optimize.curve_fit(getDiam, X, Y, p0= [0.]*X.shape[1])
>
> In the sample file there are 11 levels of the `plot` that have only a
> single observation each. I tried to use onewaygls, but statsmodels.OLS
> doesn't work if y is a scalar.
>
> I don't know whether curve_fit or optimize.leastsq will converge in
> this case, good starting values might be necessary.
> Josef
>
> > > You could linearize your objective function to be
> > >
> > > b*ln(ht)
> > >
> > > and do something like
> > >
> > > indHtPlot = dta['height']
> > > depDbh = dta['dbh']
> > > X = np.column_stack((np.log(indHtPlot), plot_dummies))
> > > Y = np.log(depDbh)
> > > res = sm.OLS(Y,X).fit()
> > > res.params
> > > array([ 0.98933264, -1.35239293, -1.0623305 , -0.99155293, -1.33675099,
> > >        -1.30657011, -1.50933751, -1.28744779, -1.43937358, -1.33805883,
> > >        -1.32744257, -1.42672539, -1.35239293, -1.60585046, -1.45239093,
> > >        -1.45695112, -1.34811186, -1.32658794, -1.21721715, -1.32853084,
> > >        -1.45775017, -1.44460388, -2.19065236, -1.3303631 , -1.20509831,
> > >        -1.37341535, -1.25746105, -1.33954972, -1.33922709, -1.247304  ])
> > >
> > > Note that your coefficient on height is now an elasticity. I'm sure
> > > I'm missing something here, but that might help you along the way.
> > >
> > > Skipper

From kwgoodman at gmail.com Mon Dec 6 19:31:28 2010
From: kwgoodman at gmail.com (Keith Goodman)
Date: Mon, 6 Dec 2010 16:31:28 -0800
Subject: [SciPy-User] fast small matrix multiplication with cython?
In-Reply-To: References: Message-ID: 

On Mon, Dec 6, 2010 at 2:34 PM, Skipper Seabold wrote:
> @cython.boundscheck(False)
> @cython.wraparound(False)
> cdef inline object matmult(ndarray[DOUBLE, ndim=2, mode='c'] A,
>                            ndarray[DOUBLE, ndim=2, mode='c'] B):
>     cdef int lda = A.shape[0]
>     cdef int n = B.shape[1]
>     cdef npy_intp *dims = [lda, n]
>     cdef ndarray[DOUBLE, ndim=2] out = PyArray_SimpleNew(2, dims, NPY_DOUBLE)
>     cdef int i,j,k
>     cdef double s

Do the cdef's above take a sizeable fraction of the time given that your input arrays are small? If so, then you could do those before you enter the inner loop where the dot product is needed. You wouldn't end up with a reusable matmult function, but you'd get rid of some overhead. So in your inner loop, you'd only have:

>     for i in xrange(lda):
>         for j in xrange(n):
>             s = 0
>             for k in xrange(A.shape[1]):
>                 s += A[i,k] * B[k,j]
>             out[i,j] = s
>     return out

From jsseabold at gmail.com Mon Dec 6 19:31:26 2010
From: jsseabold at gmail.com (Skipper Seabold)
Date: Mon, 6 Dec 2010 19:31:26 -0500
Subject: [SciPy-User] fast small matrix multiplication with cython?
In-Reply-To: References: Message-ID: 

On Mon, Dec 6, 2010 at 7:23 PM, Robert Kern wrote:
> On Mon, Dec 6, 2010 at 18:11, Pauli Virtanen wrote:
>> [...]
>> You'll be hard pressed to do better than Numpy's dot. [...]
>
> The main thing for his use case is reducing the overhead when called
> from Cython. [...]

Sorry for the cross-post. I figured this was better hashed out over here.

> Getting access to the C BLAS implementations would be best. I guess
> you could get descr.f.dotfunc and use that.

Thanks, I will see what I can come up with. I know it can be sped up, since other software in C++ solves the whole optimization almost instantaneously when mine takes ~5 seconds for the same case, and my profiling says that most of the time is spent in the loglikelihood loop.

>> However, in many cases the small matrices can be just stuffed into a
>> single Numpy array.
>
> His use case (Kalman filters) prevents this.

For posterity's sake, more akin to my actual problem: http://groups.google.com/group/cython-users/browse_thread/thread/a605a70626a455d

> -- 
> Robert Kern
> [signature snipped]

From josef.pktd at gmail.com Mon Dec 6 19:33:13 2010
From: josef.pktd at gmail.com (josef.pktd at gmail.com)
Date: Mon, 6 Dec 2010 19:33:13 -0500
Subject: [SciPy-User] fast small matrix multiplication with cython?
In-Reply-To: References: Message-ID: 

On Mon, Dec 6, 2010 at 5:34 PM, Skipper Seabold wrote:
> I'm wondering if anyone might have a look at my cython code that does
> matrix multiplication and see where I can speed it up or offer some
> pointers/reading. [...]
> Cython code is attached and inlined below.
> [inlined Cython code snipped -- see Skipper's original message -- except:]
>
> def multAB(ndarray[DOUBLE, ndim=2] A, ndarray[DOUBLE, ndim=2] B):
>     for i in xrange(1000000):
>         C = matmult(A,B)
>     return C

Does this generate C code, since it's not a cdef? (I haven't updated cython in a while.) I guess you would want to have the entire loop in C.

Josef

> def dotAB(ndarray[DOUBLE, ndim=2] A, ndarray[DOUBLE, ndim=2] B):
>     for i in xrange(1000000):
>         C = dot(A,B)
>     return C
>
> Skipper

From jsseabold at gmail.com Mon Dec 6 19:41:09 2010
From: jsseabold at gmail.com (Skipper Seabold)
Date: Mon, 6 Dec 2010 19:41:09 -0500
Subject: [SciPy-User] ancova with optimize.curve_fit
In-Reply-To: <2061813969B94BD39B4BFFDABFFC2D00@gmail.com>
References: <2EDB4EB07FAA4F78B5ED0C2801524D5F@gmail.com> <2061813969B94BD39B4BFFDABFFC2D00@gmail.com>
Message-ID: 

On Mon, Dec 6, 2010 at 7:31 PM, Peter Tittmann wrote:
> thanks both of you,
> [...]
> I'm sorry if this is an obvious question, but I don't understand how this
> works, as it seems that the "a" coefficient is missing.

Right. I took out the 'a', because as I read it when I linearized (I might be misunderstanding ancova, I never recall the details), if you include 'a' and also all of the dummy variables for the plot, then you will have the problem of multicollinearity. You could also include 'a' and drop one of the plot dummies, but then 'a' is just your reference category that you dropped. So now b[0] is the nonlinear effect of your main variable and b[1:] contains linear shift effects of all the plots. Hmm, thinking about it some more, though, I think you could include 'a' in the non-linear version above (call it b[0] and shift everything else over by one), because now 'a' would be the effect when the current b[0] is zero. I was just unsure how you meant 'a' when you had a*ht**b and were trying to include in ht the plot variable dummies.

Skipper
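As a concrete sketch of the parameterizations discussed in this subthread -- an editor's illustration reusing the sm.OLS pattern from Skipper's earlier message; dta and plot_dummies are assumed to be defined as they are there:

import numpy as np
import scikits.statsmodels as sm

# dta and plot_dummies as constructed in Skipper's message above (assumed)
lht = np.log(dta['height'])
ldbh = np.log(dta['dbh'])

# (a) a separate slope for every plot: all dummy*log(height) interactions,
#     dropping the common log(height) column
X_a = np.column_stack((plot_dummies, plot_dummies * lht[:, None]))
res_a = sm.OLS(ldbh, X_a).fit()

# (b) a common slope plus per-plot deviations: keep log(height) and drop
#     one interaction as the reference category
X_b = np.column_stack((plot_dummies, lht, plot_dummies[:, 1:] * lht[:, None]))
res_b = sm.OLS(ldbh, X_b).fit()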
I was just unsure how you meant 'a' when you had a*ht**b and were
trying to include in ht the plot variable dummies.

Skipper

From josef.pktd at gmail.com  Mon Dec  6 19:55:04 2010
From: josef.pktd at gmail.com (josef.pktd at gmail.com)
Date: Mon, 6 Dec 2010 19:55:04 -0500
Subject: [SciPy-User] ancova with optimize.curve_fit
In-Reply-To: 
References: <2EDB4EB07FAA4F78B5ED0C2801524D5F@gmail.com>
	<2061813969B94BD39B4BFFDABFFC2D00@gmail.com>
Message-ID: 

On Mon, Dec 6, 2010 at 7:41 PM, Skipper Seabold wrote:
> On Mon, Dec 6, 2010 at 7:31 PM, Peter Tittmann wrote:
>> thanks both of you,
>> Josef, the data that I sent is only the first 100 rows of about 1500, there
>> should be sufficient sampling in each plot.
>> Skipper, I have attempted to deploy your suggestion for not linearizing the
>> data. It seems to work. I'm a little confused at your modification of the
>> getDiam function and I wonder if you could help me understand. The form of
>> the equation that is being fit is:
>> Y = a*X^b
>> your version of the getDiam function:
>>
>> def getDiam(ht, *b):
>>     return ht[:,0]**b[0] + np.sum(b[1:]*ht[:,1:], axis=1)
>>
>> I'm sorry if this is an obvious question but I don't understand how this
>> works as it seems that the "a" coefficient is missing.
>> Thanks again!
>
> Right. I took out the 'a', because as I read it when I linearized (I
> might be misunderstanding ancova, I never recall the details), if you
> include 'a' and also all of the dummy variables for the plot, then you
> will have the problem of multicollinearity. You could also include
> 'a' and drop one of the plot dummies, but then 'a' is just your
> reference category that you dropped. So now b[0] is the nonlinear
> effect of your main variable and b[1:] contains linear shift effects
> of all the plots. Hmm, thinking about it some more, though, I think
> you could include 'a' in the non-linear version above (call it b[0]
> and shift everything else over by one), because now 'a' would be the
> effect when the current b[0] is zero. I was just unsure how you meant
> 'a' when you had a*ht**b and were trying to include in ht the plot
> variable dummies.

As I understand it, the intention is to estimate equality of the slope
coefficients, so the continuous variable is multiplied with the dummy
variables. In this case, the constant should still be added. The
normalization question is whether to include all dummy-cont.variable
products and drop the continuous variable, or include the continuous
variable and drop one of the dummy-cont levels.

Unless there is a strong reason to avoid log-normality of errors, I
would work (first) with the linear version.

Josef

>
> Skipper
> _______________________________________________
> SciPy-User mailing list
> SciPy-User at scipy.org
> http://mail.scipy.org/mailman/listinfo/scipy-user
>

From ptittmann at gmail.com  Mon Dec  6 23:00:19 2010
From: ptittmann at gmail.com (Peter Tittmann)
Date: Mon, 6 Dec 2010 20:00:19 -0800
Subject: [SciPy-User] ancova with optimize.curve_fit
In-Reply-To: 
References: <2EDB4EB07FAA4F78B5ED0C2801524D5F@gmail.com>
	<2061813969B94BD39B4BFFDABFFC2D00@gmail.com>
Message-ID: <80C1F8072EB74C4F903BC96C09FB1D6A@gmail.com>

Gentlemen,

I've decided to switch to the OLS method, though I did get the NNLS
method that Skipper proposed working. I was not prepared to spend more
time trying to make sense of the resulting array for ancova, etc.
(also could not figure out how to interpret the resulting coefficient
array, as I was expecting a 2d array representing the a and b
coefficient values but it returned a 1d array). I have some hopefully
simple follow-up questions:

1. Is there a method to define explicitly the function used in OLS? I
know numpy.linalg.lstsq is the way OLS works but is there another
function where I can define the form?

2. I'm still interested in interpreting the results of the NNLS method,
so if either of you can suggest what the resulting arrays mean I'd be
grateful. I've attached the output of NNLS

warm regards,

Peter

Here is the working version of NNLS:

import numpy as np
import scikits.statsmodels as sm
from scipy.optimize import curve_fit

def getDiam2(ht, *b):
    return b[0] * ht[:,1]**b[1] + np.sum(b[2:]*ht[:,2:], axis=1)

dt = np.genfromtxt('/home/peter/Desktop/db_out.csv', delimiter=",",
                   names=True, dtype=None)

indHt = dt['height']
indPlot = dt['plot']
depDbh = dt['dbh']

def nnlsDummies():
    '''this function returns coefficients and covariance arrays'''
    plot_dummies, col_map = sm.tools.categorical(indPlot, drop=True,
                                                 dictnames=True)
    X = np.column_stack((indHt, plot_dummies))
    Y = depDbh
    coefs, cov = curve_fit(getDiam2, X, Y, p0=[0.]*X.shape[1])
    return coefs, cov

--
Peter Tittmann

On Monday, December 6, 2010 at 4:55 PM, josef.pktd at gmail.com wrote:
> On Mon, Dec 6, 2010 at 7:41 PM, Skipper Seabold wrote:
> > On Mon, Dec 6, 2010 at 7:31 PM, Peter Tittmann wrote:
> > > thanks both of you,
> > > Josef, the data that I sent is only the first 100 rows of about 1500, there
> > > should be sufficient sampling in each plot.
> > > Skipper, I have attempted to deploy your suggestion for not linearizing the
> > > data. It seems to work. I'm a little confused at your modification of the
> > > getDiam function and I wonder if you could help me understand. The form of
> > > the equation that is being fit is:
> > > Y = a*X^b
> > > your version of the getDiam function:
> > >
> > > def getDiam(ht, *b):
> > >     return ht[:,0]**b[0] + np.sum(b[1:]*ht[:,1:], axis=1)
> > >
> > > I'm sorry if this is an obvious question but I don't understand how this
> > > works as it seems that the "a" coefficient is missing.
> > > Thanks again!
> >
> > Right. I took out the 'a', because as I read it when I linearized (I
> > might be misunderstanding ancova, I never recall the details), if you
> > include 'a' and also all of the dummy variables for the plot, then you
> > will have the problem of multicollinearity. You could also include
> > 'a' and drop one of the plot dummies, but then 'a' is just your
> > reference category that you dropped. So now b[0] is the nonlinear
> > effect of your main variable and b[1:] contains linear shift effects
> > of all the plots. Hmm, thinking about it some more, though, I think
> > you could include 'a' in the non-linear version above (call it b[0]
> > and shift everything else over by one), because now 'a' would be the
> > effect when the current b[0] is zero. I was just unsure how you meant
> > 'a' when you had a*ht**b and were trying to include in ht the plot
> > variable dummies.
>
> As I understand it, the intention is to estimate equality of the slope
> coefficients, so the continuous variable is multiplied with the dummy
> variables. In this case, the constant should still be added. The
> normalization question is whether to include all dummy-cont.variable
> products and drop the continuous variable, or include the continuous
> variable and drop one of the dummy-cont levels.
>
> Unless there is a strong reason to avoid log-normality of errors, I
> would work (first) with the linear version.
>
> Josef
>
> >
> > Skipper
> > _______________________________________________
> > SciPy-User mailing list
> > SciPy-User at scipy.org
> > http://mail.scipy.org/mailman/listinfo/scipy-user
> >
>
> _______________________________________________
> SciPy-User mailing list
> SciPy-User at scipy.org
> http://mail.scipy.org/mailman/listinfo/scipy-user

-------------- next part --------------
An HTML attachment was scrubbed...
URL: 
-------------- next part --------------
A non-text attachment was scrubbed...
Name: nnls_out.rtf
Type: application/octet-stream
Size: 5564 bytes
Desc: not available
URL: 

From josef.pktd at gmail.com  Tue Dec  7 01:10:04 2010
From: josef.pktd at gmail.com (josef.pktd at gmail.com)
Date: Tue, 7 Dec 2010 01:10:04 -0500
Subject: [SciPy-User] ancova with optimize.curve_fit
In-Reply-To: <80C1F8072EB74C4F903BC96C09FB1D6A@gmail.com>
References: <2EDB4EB07FAA4F78B5ED0C2801524D5F@gmail.com>
	<2061813969B94BD39B4BFFDABFFC2D00@gmail.com>
	<80C1F8072EB74C4F903BC96C09FB1D6A@gmail.com>
Message-ID: 

On Mon, Dec 6, 2010 at 11:00 PM, Peter Tittmann wrote:
> Gentlemen,
> I've decided to switch to the OLS method, though I did get the NNLS method
> that Skipper proposed working. I was not prepared to spend more time trying
> to make sense of the resulting array for ancova, etc. (also could not figure
> out how to interpret the resulting coefficient array, as I was expecting a 2d
> array representing the a and b coefficient values but it returned a 1d
> array). I have some hopefully simple follow-up questions:
> 1. Is there a method to define explicitly the function used in OLS? I know
> numpy.linalg.lstsq is the way OLS works but is there another function where
> I can define the form?
> 2. I'm still interested in interpreting the results of the NNLS method, so
> if either of you can suggest what the resulting arrays mean I'd be grateful.
> I've attached the output of NNLS
> warm regards,
> Peter
>
> Here is the working version of NNLS:
>
> def getDiam2(ht, *b):
>     return b[0] * ht[:,1]**b[1] + np.sum(b[2:]*ht[:,2:], axis=1)
>
> dt = np.genfromtxt('/home/peter/Desktop/db_out.csv', delimiter=",",
>                    names=True, dtype=None)
>
> indHt = dt['height']
> indPlot = dt['plot']
> depDbh = dt['dbh']
>
> def nnlsDummies():
>     '''this function returns coefficients and covariance arrays'''
>     plot_dummies, col_map = sm.tools.categorical(indPlot, drop=True,
>                                                  dictnames=True)
>     X = np.column_stack((indHt, plot_dummies))
>     Y = depDbh
>     coefs, cov = curve_fit(getDiam2, X, Y, p0=[0.]*X.shape[1])
>     return coefs, cov

(can you post at the bottom instead of the top, that's the custom on
this mailing list)

getDiam2 is just a linear model in this case, the first coefficient
should be the effect/slope of indHt, the remaining are the
constants/intercept for each (level of) "plot". According to your
output there would be 301 or 302 unique values in your plot array.
np.unique(indPlot)

The hypothesis that there are no differences across plots means that
all the coefficients (except the first) are the same. An f-test would
be the usual to check this.
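A rough, untested sketch of the restriction matrix for that test,
assuming coefs from the curve_fit above (coefs[0] the slope, coefs[1:]
the per-plot constants):

import numpy as np

k = len(coefs)
# pairwise contrasts among the per-plot constants,
# H0: const_1 = const_2 = ... = const_{k-1}
R = np.zeros((k - 2, k))
for i in range(k - 2):
    R[i, i + 1] = 1.
    R[i, i + 2] = -1.
# R can then go into an f-test on a regression result, e.g. something
# like res.f_test(R) (the exact interface depends on the statsmodels
# version you have).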
If instead you want to check that the effect/coefficient of Ht is
independent of plot, then you should use the product
indHt[:,None]*plot_dummies (all plot dummies, use drop=False).

If you already have statsmodels, then you could estimate the original
linear model that Skipper described, take

y = np.log(depDbh)
x = sm.add_constant(np.log(indHt)[:,None]*plot_dummies)

then you can estimate

res = sm.OLS(y, x).fit()

res.params are the parameters. Then you can do an f_test, which
depends on the version of statsmodels that you have.

You can also do an f_test with the results from the non-linear
curve_fit. I guess the easiest will be to estimate the model with and
without dummies, and compare the residual sum of squares with
scipy.stats.f_anova (?).

Josef

>
> --
> Peter Tittmann
>
> On Monday, December 6, 2010 at 4:55 PM, josef.pktd at gmail.com wrote:
>
> On Mon, Dec 6, 2010 at 7:41 PM, Skipper Seabold wrote:
>
> On Mon, Dec 6, 2010 at 7:31 PM, Peter Tittmann wrote:
>> thanks both of you,
>> Josef, the data that I sent is only the first 100 rows of about 1500,
>> there
>> should be sufficient sampling in each plot.
>> Skipper, I have attempted to deploy your suggestion for not linearizing
>> the
>> data. It seems to work. I'm a little confused at your modification of the
>> getDiam function and I wonder if you could help me understand. The form of
>> the equation that is being fit is:
>> Y = a*X^b
>> your version of the getDiam function:
>>
>> def getDiam(ht, *b):
>>     return ht[:,0]**b[0] + np.sum(b[1:]*ht[:,1:], axis=1)
>>
>> I'm sorry if this is an obvious question but I don't understand how this
>> works as it seems that the "a" coefficient is missing.
>> Thanks again!
>
> Right. I took out the 'a', because as I read it when I linearized (I
> might be misunderstanding ancova, I never recall the details), if you
> include 'a' and also all of the dummy variables for the plot, then you
> will have the problem of multicollinearity. You could also include
> 'a' and drop one of the plot dummies, but then 'a' is just your
> reference category that you dropped. So now b[0] is the nonlinear
> effect of your main variable and b[1:] contains linear shift effects
> of all the plots. Hmm, thinking about it some more, though, I think
> you could include 'a' in the non-linear version above (call it b[0]
> and shift everything else over by one), because now 'a' would be the
> effect when the current b[0] is zero. I was just unsure how you meant
> 'a' when you had a*ht**b and were trying to include in ht the plot
> variable dummies.
>
> As I understand it, the intention is to estimate equality of the slope
> coefficients, so the continuous variable is multiplied with the dummy
> variables. In this case, the constant should still be added. The
> normalization question is whether to include all dummy-cont.variable
> products and drop the continuous variable, or include the continuous
> variable and drop one of the dummy-cont levels.
>
> Unless there is a strong reason to avoid log-normality of errors, I
> would work (first) with the linear version.
> > Josef > > > > Skipper > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > From fperez.net at gmail.com Tue Dec 7 01:56:17 2010 From: fperez.net at gmail.com (Fernando Perez) Date: Mon, 6 Dec 2010 22:56:17 -0800 Subject: [SciPy-User] fast small matrix multiplication with cython? In-Reply-To: References: Message-ID: Hi Skipper, On Mon, Dec 6, 2010 at 2:34 PM, Skipper Seabold wrote: > I'm wondering if anyone might have a look at my cython code that does > matrix multiplication and see where I can speed it up or offer some > pointers/reading. ?I'm new to Cython and my knowledge of C is pretty > basic based on trial and (mostly) error, so I am sure the code is > still very naive. a few years ago I had a similar problem, and I ended up getting a very significant speedup by hand-coding a very unsafe, but very fast pure C extension just to compute these inner products. This was basically a replacement for dot() that would only work with double precision inputs of compatible dimensions and would happily segfault with anything else, but it ran very fast. The inner loop is implemented completely naively, but it still beats calls to BLAS (even linked with ATLAS) for small matrix dimensions (my case was also up to ~ 15x15). I'm attaching the code in case you find it useful, please keep in mind I haven't compiled it in years, so it may have bit-rotted a little. Cheers, f -------------- next part -------------- A non-text attachment was scrubbed... Name: flinalg.c Type: text/x-csrc Size: 8658 bytes Desc: not available URL: From dagss at student.matnat.uio.no Tue Dec 7 03:51:59 2010 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Tue, 07 Dec 2010 09:51:59 +0100 Subject: [SciPy-User] fast small matrix multiplication with cython? In-Reply-To: References: Message-ID: <4CFDF5AF.8060103@student.matnat.uio.no> On 12/07/2010 07:56 AM, Fernando Perez wrote: > Hi Skipper, > > On Mon, Dec 6, 2010 at 2:34 PM, Skipper Seabold wrote: > >> I'm wondering if anyone might have a look at my cython code that does >> matrix multiplication and see where I can speed it up or offer some >> pointers/reading. I'm new to Cython and my knowledge of C is pretty >> basic based on trial and (mostly) error, so I am sure the code is >> still very naive. >> > a few years ago I had a similar problem, and I ended up getting a very > significant speedup by hand-coding a very unsafe, but very fast pure C > extension just to compute these inner products. This was basically a > replacement for dot() that would only work with double precision > inputs of compatible dimensions and would happily segfault with > anything else, but it ran very fast. The inner loop is implemented > completely naively, but it still beats calls to BLAS (even linked with > ATLAS) for small matrix dimensions (my case was also up to ~ 15x15). 
> Another idea: If the matrices are more in the intermediate range, here's a Cython library for calling BLAS more directly: http://www.vetta.org/2009/09/tokyo-a-cython-blas-wrapper-for-fast-matrix-math/ For intermediate-size matrices the use of SSE instructions should be able to offset any call overhead. Try to stay clear of using NumPy for slicing though, instead one should do pointer arithmetic... Dag Sverre From charlesr.harris at gmail.com Tue Dec 7 09:54:39 2010 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 7 Dec 2010 07:54:39 -0700 Subject: [SciPy-User] fast small matrix multiplication with cython? In-Reply-To: References: Message-ID: On Mon, Dec 6, 2010 at 11:56 PM, Fernando Perez wrote: > Hi Skipper, > > On Mon, Dec 6, 2010 at 2:34 PM, Skipper Seabold > wrote: > > I'm wondering if anyone might have a look at my cython code that does > > matrix multiplication and see where I can speed it up or offer some > > pointers/reading. I'm new to Cython and my knowledge of C is pretty > > basic based on trial and (mostly) error, so I am sure the code is > > still very naive. > > a few years ago I had a similar problem, and I ended up getting a very > significant speedup by hand-coding a very unsafe, but very fast pure C > extension just to compute these inner products. This was basically a > replacement for dot() that would only work with double precision > inputs of compatible dimensions and would happily segfault with > anything else, but it ran very fast. The inner loop is implemented > completely naively, but it still beats calls to BLAS (even linked with > ATLAS) for small matrix dimensions (my case was also up to ~ 15x15). > > I'm attaching the code in case you find it useful, please keep in mind > I haven't compiled it in years, so it may have bit-rotted a little. > > Blas adds quite a bit of overhead for multiplying small matrices, but so does calling from python. For implementing Kalman filters it might be better to write a whole Kalman class so that operations can be combined at the c level. Skipper, what kind of Kalman filter are you trying to implement? Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From kwgoodman at gmail.com Tue Dec 7 10:35:54 2010 From: kwgoodman at gmail.com (Keith Goodman) Date: Tue, 7 Dec 2010 07:35:54 -0800 Subject: [SciPy-User] fast small matrix multiplication with cython? In-Reply-To: References: Message-ID: On Mon, Dec 6, 2010 at 2:34 PM, Skipper Seabold wrote: > I'm wondering if anyone might have a look at my cython code that does > matrix multiplication and see where I can speed it up or offer some > pointers/reading. ?I'm new to Cython and my knowledge of C is pretty > basic based on trial and (mostly) error, so I am sure the code is > still very naive. > > > > ? ?cdef ndarray[DOUBLE, ndim=2] out = PyArray_SimpleNew(2, dims, NPY_DOUBLE) I'd like to reduce the overhead in creating the empty array. Using PyArray_SimpleNew in Cython is faster than using np.empty but both are slower than using np.empty without Cython. Have I done something wrong? 
I suspect it has something to do with this line in the code below:
"cdef npy_intp *dims = [r, c]"

PyArray_SimpleNew:

>> timeit matmult(2,2)
1000000 loops, best of 3: 773 ns per loop

np.empty in cython:

>> timeit matmult2(2,2)
1000000 loops, best of 3: 1.62 us per loop

np.empty in python:

>> timeit np.empty((2,2))
1000000 loops, best of 3: 465 ns per loop

Code:

import numpy as np
from numpy cimport float64_t, ndarray, NPY_DOUBLE, npy_intp

ctypedef float64_t DOUBLE

cdef extern from "numpy/arrayobject.h":
    cdef void import_array()
    cdef object PyArray_SimpleNew(int nd, npy_intp *dims, int typenum)

# initialize numpy
import_array()

def matmult(int r, int c):
    cdef npy_intp *dims = [r, c] # Is there a faster way to do this?
    cdef ndarray[DOUBLE, ndim=2] out = PyArray_SimpleNew(2, dims, NPY_DOUBLE)
    return out

def matmult2(int r, int c):
    cdef ndarray[DOUBLE, ndim=2] out = np.empty((r, c), dtype=np.float64)
    return out

From robert.kern at gmail.com  Tue Dec  7 10:37:37 2010
From: robert.kern at gmail.com (Robert Kern)
Date: Tue, 7 Dec 2010 09:37:37 -0600
Subject: [SciPy-User] fast small matrix multiplication with cython?
In-Reply-To: 
References: 
Message-ID: 

On Tue, Dec 7, 2010 at 08:54, Charles R Harris wrote:

> Blas adds quite a bit of overhead for multiplying small matrices, but so
> does calling from python. For implementing Kalman filters it might be better
> to write a whole Kalman class so that operations can be combined at the c
> level.

As I said, he's writing the Kalman filter in Cython.

> Skipper, what kind of Kalman filter are you trying to implement?

Does this help?

http://groups.google.com/group/cython-users/browse_thread/thread/a605a70626a455d?pli=1

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
  -- Umberto Eco

From bsouthey at gmail.com  Tue Dec  7 16:58:41 2010
From: bsouthey at gmail.com (Bruce Southey)
Date: Tue, 07 Dec 2010 15:58:41 -0600
Subject: [SciPy-User] ancova with optimize.curve_fit
In-Reply-To: <80C1F8072EB74C4F903BC96C09FB1D6A@gmail.com>
References: <2EDB4EB07FAA4F78B5ED0C2801524D5F@gmail.com>
	<2061813969B94BD39B4BFFDABFFC2D00@gmail.com>
	<80C1F8072EB74C4F903BC96C09FB1D6A@gmail.com>
Message-ID: <4CFEAE11.7060408@gmail.com>

On 12/06/2010 10:00 PM, Peter Tittmann wrote:
>
> Gentlemen,
>
> I've decided to switch to the OLS method, though I did get the NNLS
> method that Skipper proposed working. I was not prepared to spend more
> time trying to make sense of the resulting array for ancova, etc.
> (also could not figure out how to interpret the resulting coefficient
> array, as I was expecting a 2d array representing the a and b
> coefficient values but it returned a 1d array). I have some hopefully
> simple follow-up questions:
>
> 1. Is there a method to define explicitly the function used in OLS? I
> know numpy.linalg.lstsq is the way OLS works but is there another
> function where I can define the form?
>
> 2. I'm still interested in interpreting the results of the NNLS
> method, so if either of you can suggest what the resulting arrays mean
> I'd be grateful.
I've attached the output of NNLS > > warm regards, > > Peter > > > Here is the working version of NNLS: > > def getDiam2(ht,*b): > return b[0] * ht[:,1]**b[1] + np.sum(b[2:]*ht[:,2:], axis=1) > > dt = np.genfromtxt('/home/peter/Desktop/db_out.csv', delimiter=",", > names=True, dtype=None) > indHtPlot = adt['height'] > depDbh = adt['dbh'] > plot_dummies, col_map = sm.tools.categorical(dt['plot], drop=True, > dictnames=True) > > > def nnlsDummies(): > '''this function returns coefficients and covariance arrays''' > plot_dummies, col_map = sm.tools.categorical(indPlot, drop=True, > dictnames=True) > X = np.column_stack((indHt, plot_dummies)) > Y = depDbh > coefs, cov = curve_fit(getDiam2, X, Y, p0= [0.]*X.shape[1]) > return coefs, cov > > > -- > Peter Tittmann > > > On Monday, December 6, 2010 at 4:55 PM, josef.pktd at gmail.com wrote: > >> On Mon, Dec 6, 2010 at 7:41 PM, Skipper Seabold > > wrote: >>> On Mon, Dec 6, 2010 at 7:31 PM, Peter Tittmann wrote: >>> > thanks both of you, >>> > Josef, the data that I sent is only the first 100 rows of about >>> 1500, there >>> > should be sufficient sampling in each plot. >>> > Skipper, I have attempted to deploy your suggestion for not >>> linearizing the >>> > data. It seems to work. I'm a little confused at your modification >>> if the >>> > getDiam function and I wonder if you could help me understand. The >>> form of >>> > the equation that is being fit is: >>> > Y= a*X^b >>> > your version of the detDaim function: >>> > >>> > def getDiam(ht, *b): >>> > return ht[:,0]**b[0] + np.sum(b[1:]*ht[:,1:], axis=1) >>> > >>> > Im sorry if this is an obvious question but I don't understand how >>> this >>> > works as it seems that the "a" coefficient is missing. >>> > Thanks again! >>> >>> Right. I took out the 'a', because as I read it when I linearized (I >>> might be misunderstanding ancova, I never recall the details), if you >>> include 'a' and also all of the dummy variables for the plot, then you >>> will have a the problem of multicollinearity. You could also include >>> 'a' and drop one of the plot dummies, but then 'a' is just your >>> reference category that you dropped. So now b[0] is the nonlinear >>> effect of your main variable and b[1:] contains linear shift effects >>> of all the plots. Hmm, thinking about it some more, though I think >>> you could include 'a' in the non-linear version above (call it b[0] >>> and shift everything else over by one), because now 'a' would be the >>> effect when the current b[0] is zero. I was just unsure how you meant >>> 'a' when you had a*ht**b and were trying to include in ht the plot >>> variable dummies. >> >> As I understand it, the intention is to estimate equality of the slope >> coefficients, so the continuous variable is multiplied with the dummy >> variables. In this case, the constant should still be added. The >> normalization question is whether to include all dummy-cont.variable >> products and drop the continuous variable, or include the continuous >> variable and drop one of the dummy-cont levels. >> >> Unless there is a strong reason to avoid log-normality of errors, I >> would work (first) with the linear version. 
>> >> Josef >> >> >>> >>> Skipper >>> _______________________________________________ >>> SciPy-User mailing list >>> SciPy-User at scipy.org >>> http://mail.scipy.org/mailman/listinfo/scipy-user >>> >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user I do think this is starting to be an off-list discussion because this is really about statistics and not about numpy/scipy (you can contact me off-list if you want). I am not sure what all the variables are so please excuse me but I presume you want to model dbh as a function of height, plot and species. Following usual biostatistics interpretation, 'plot' is probably treated as random effect but you probably have to use R/SAS etc for that for both linear and nonlinear models or some spatial models. Really you need to determine whether or not a nonlinear model is required. With the few data points you provided, I only see a linear relationship between dbh and height with some outliers and perhaps some heterogeneity of variance. Often doing a simple polynomial/spline can help to see if there is any evidence for a nonlinear relationship in the full data - a linear model or polynomial with the data provided does not suggest a nonlinear model. Obviously a linear model is easier to fit and interpret especially if you create the design matrix as estimable functions (which is rather trivial once you understand using dummy variables). The most general nonlinear/multilevel model proposed is of the form: dbh= C + A*height^B Obviously if B=1 then it is a linear model and the parameters A, B and C can be modeled with a linear function of intercept, plot and species. Although, if 'plot' is what I think it is then you probably would not model the parameters A and B with it. Without C you are forcing the curve through zero which is biological feasible if you expect dbh=0 when height is zero. However, dbh can be zero if height is not zero just due to the model itself or what dbh actually is (it may take a minimum height before dbh is greater than zero). With the data you provided, there are noticeable differences between species for dbh and height so you probably need C in your model. For this general model you probably should just fit the curve for each species alone but I would use a general stats package to do this. This will give you a good starting point to know how well the curve fits each species as well as the similarity of parameters and residual variation. Getting convergence with a model that has B varying across species may be rather hard so I would suggest modeling A and C first. Bruce -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Tue Dec 7 11:10:22 2010 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 7 Dec 2010 09:10:22 -0700 Subject: [SciPy-User] fast small matrix multiplication with cython? In-Reply-To: References: Message-ID: On Tue, Dec 7, 2010 at 8:37 AM, Robert Kern wrote: > On Tue, Dec 7, 2010 at 08:54, Charles R Harris > wrote: > > > Blas adds quite a bit of overhead for multiplying small matrices, but so > > does calling from python. For implementing Kalman filters it might be > better > > to write a whole Kalman class so that operations can be combined at the c > > level. 
> > As I said, he's writing the Kalman filter in Cython. > > > Skipper, what kind of Kalman filter are you trying to implement? > > Does this help? > > > http://groups.google.com/group/cython-users/browse_thread/thread/a605a70626a455d?pli=1 > > A bit, but it isn't a class. Since the Kalman filter is basically weighted linear least squares with a noisy change of variable, Skipper's function could probably be implemented that way also. Since he seems to be doing a lot of observation updates in a single go the information Kalman filter, which basically implements the usual least squares, might be a faster way to go. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From jsseabold at gmail.com Tue Dec 7 11:33:25 2010 From: jsseabold at gmail.com (Skipper Seabold) Date: Tue, 7 Dec 2010 11:33:25 -0500 Subject: [SciPy-User] fast small matrix multiplication with cython? In-Reply-To: References: Message-ID: On Tue, Dec 7, 2010 at 1:56 AM, Fernando Perez wrote: > Hi Skipper, > > On Mon, Dec 6, 2010 at 2:34 PM, Skipper Seabold wrote: >> I'm wondering if anyone might have a look at my cython code that does >> matrix multiplication and see where I can speed it up or offer some >> pointers/reading. ?I'm new to Cython and my knowledge of C is pretty >> basic based on trial and (mostly) error, so I am sure the code is >> still very naive. > > a few years ago I had a similar problem, and I ended up getting a very > significant speedup by hand-coding a very unsafe, but very fast pure C > extension just to compute these inner products. ?This was basically a > replacement for dot() that would only work with double precision > inputs of compatible dimensions and would happily segfault with > anything else, but it ran very fast. ?The inner loop is implemented > completely naively, but it still beats calls to BLAS (even linked with > ATLAS) for small matrix dimensions (my case was also up to ~ 15x15). > > I'm attaching the code in case you find it useful, please keep in mind > I haven't compiled it in years, so it may have bit-rotted a little. > > Cheers, > > f > Thanks. This was my next step and would've taken me some time. Skipper From josef.pktd at gmail.com Tue Dec 7 11:35:38 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Tue, 7 Dec 2010 11:35:38 -0500 Subject: [SciPy-User] ancova with optimize.curve_fit In-Reply-To: <4CFEAE11.7060408@gmail.com> References: <2EDB4EB07FAA4F78B5ED0C2801524D5F@gmail.com> <2061813969B94BD39B4BFFDABFFC2D00@gmail.com> <80C1F8072EB74C4F903BC96C09FB1D6A@gmail.com> <4CFEAE11.7060408@gmail.com> Message-ID: On Tue, Dec 7, 2010 at 4:58 PM, Bruce Southey wrote: > On 12/06/2010 10:00 PM, Peter Tittmann wrote: > > Gentlemen, > I've decided to switch to the OLS method, thought I did get the NNLS method > that Skipper proposed working. I was not prepared to spend more time trying > to make sense of the resulting array for ancova, etc. (also could not figure > out how to interpret the resulting coefficient array as I was expecting a 2d > array representing the a and b coefficient values but it returned a 1d > array). I have hopefully simple follow up questions: > 1. Is there a method to define explicitly the function used in OLS? I know > numpy.linalg.lstsq is the way OLS works but is there another function where > I can define the form? > 2. I'm still interested in interpreting the results of the NNLS method, so > if either of you can suggest what the resulting arrays mean id be grateful. 
> I've attached the output of NNLS > warm regards, > Peter > > Here is the working version of NNLS: > def getDiam2(ht,*b): > ?? ?return b[0] * ht[:,1]**b[1] + np.sum(b[2:]*ht[:,2:], axis=1) > dt = np.genfromtxt('/home/peter/Desktop/db_out.csv', delimiter=",", > names=True, dtype=None) > > indHtPlot = adt['height'] > depDbh = adt['dbh'] > plot_dummies, col_map = sm.tools.categorical(dt['plot], drop=True, > dictnames=True) > > def nnlsDummies(): > ?? ?'''this function returns coefficients and covariance arrays''' > ?? ?plot_dummies, col_map = sm.tools.categorical(indPlot, drop=True, > dictnames=True) > ?? ?X = np.column_stack((indHt, plot_dummies)) > ?? ?Y = depDbh > ?? ?coefs, cov = curve_fit(getDiam2, X, Y, p0= [0.]*X.shape[1]) > ?? ?return coefs, cov > > -- > Peter Tittmann > > On Monday, December 6, 2010 at 4:55 PM, josef.pktd at gmail.com wrote: > > On Mon, Dec 6, 2010 at 7:41 PM, Skipper Seabold wrote: > > On Mon, Dec 6, 2010 at 7:31 PM, Peter Tittmann wrote: >> thanks both of you, >> Josef, the data that I sent is only the first 100 rows of about 1500, >> there >> should be sufficient sampling in each plot. >> Skipper, I have attempted to deploy your suggestion for not linearizing >> the >> data. It seems to work. I'm a little confused at your modification if the >> getDiam function and I wonder if you could help me understand. The form of >> the equation that is being fit is: >> Y= a*X^b >> your version of the detDaim function: >> >> def getDiam(ht, *b): >> ? ?return ht[:,0]**b[0] + np.sum(b[1:]*ht[:,1:], axis=1) >> >> Im sorry if this is an obvious question but I don't understand how this >> works as it seems that the "a" coefficient is missing. >> Thanks again! > > Right. ?I took out the 'a', because as I read it when I linearized (I > might be misunderstanding ancova, I never recall the details), if you > include 'a' and also all of the dummy variables for the plot, then you > will have a the problem of multicollinearity. ?You could also include > 'a' and drop one of the plot dummies, but then 'a' is just your > reference category that you dropped. ?So now b[0] is the nonlinear > effect of your main variable and b[1:] contains linear shift effects > of all the plots. ?Hmm, thinking about it some more, though I think > you could include 'a' in the non-linear version above (call it b[0] > and shift everything else over by one), because now 'a' would be the > effect when the current b[0] is zero. ?I was just unsure how you meant > 'a' when you had a*ht**b and were trying to include in ht the plot > variable dummies. > > As I understand it, the intention is to estimate equality of the slope > coefficients, so the continuous variable is multiplied with the dummy > variables. In this case, the constant should still be added. The > normalization question is whether to include all dummy-cont.variable > products and drop the continuous variable, or include the continuous > variable and drop one of the dummy-cont levels. > > Unless there is a strong reason to avoid log-normality of errors, I > would work (first) with the linear version. 
> > Josef > > > > Skipper > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > I do think this is starting to be an off-list discussion because this is > really about statistics and not about numpy/scipy (you can contact me > off-list if you want). If it's too much stats, we can continue on the statsmodels list. Last time there was a similar question, I programmed most of the tests for the linear case, and it should soon be possible to do it for the non-linear case also. The target is to be able to do this in 5 (or so) lines of code for the linear case and maybe 10 lines for the non-linear case. Josef > > I am not sure what all the variables are so please excuse me but I presume > you want to model dbh as a function of height, plot and species. Following > usual biostatistics interpretation, 'plot' is probably treated as random > effect but you probably have to use R/SAS etc for that for both linear and > nonlinear models or some spatial models. > > Really you need to determine whether or not a nonlinear model is required. > With the few data points you provided, I only see a linear relationship > between dbh and height with some outliers and perhaps some heterogeneity of > variance. Often doing a simple polynomial/spline can help to see if there is > any evidence for a nonlinear relationship in the full data - a linear model > or polynomial with the data provided does not suggest a nonlinear model. > Obviously a linear model is easier to fit and interpret especially if you > create the design matrix as estimable functions (which is rather trivial > once you understand using dummy variables). > > The most general nonlinear/multilevel model proposed is of the form: > dbh= C + A*height^B > Obviously if B=1 then it is a linear model and the parameters A, B and C can > be modeled with a linear function of intercept, plot and species. Although, > if 'plot' is what I think it is then you probably would not model the > parameters A and B with it. > > Without C you are forcing the curve through zero which is biological > feasible if you expect dbh=0 when height is zero. However, dbh can be zero > if height is not zero just due to the model itself or what dbh actually is > (it may take a minimum height before dbh is greater than zero). With the > data you provided, there are noticeable differences between species for dbh > and height so you probably need C in your model. > > For this general model you probably should just fit the curve for each > species alone but I would use a general stats package to do this. This will > give you a good starting point to know how well the curve fits each species > as well as the similarity of parameters and residual variation. Getting > convergence with a model that has B varying across species may be rather > hard so I would suggest modeling A and C first. 
> > Bruce > > > > > > > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > From jsseabold at gmail.com Tue Dec 7 11:37:34 2010 From: jsseabold at gmail.com (Skipper Seabold) Date: Tue, 7 Dec 2010 11:37:34 -0500 Subject: [SciPy-User] fast small matrix multiplication with cython? In-Reply-To: <4CFDF5AF.8060103@student.matnat.uio.no> References: <4CFDF5AF.8060103@student.matnat.uio.no> Message-ID: On Tue, Dec 7, 2010 at 3:51 AM, Dag Sverre Seljebotn wrote: > On 12/07/2010 07:56 AM, Fernando Perez wrote: >> Hi Skipper, >> >> On Mon, Dec 6, 2010 at 2:34 PM, Skipper Seabold ?wrote: >> >>> I'm wondering if anyone might have a look at my cython code that does >>> matrix multiplication and see where I can speed it up or offer some >>> pointers/reading. ?I'm new to Cython and my knowledge of C is pretty >>> basic based on trial and (mostly) error, so I am sure the code is >>> still very naive. >>> >> a few years ago I had a similar problem, and I ended up getting a very >> significant speedup by hand-coding a very unsafe, but very fast pure C >> extension just to compute these inner products. ?This was basically a >> replacement for dot() that would only work with double precision >> inputs of compatible dimensions and would happily segfault with >> anything else, but it ran very fast. ?The inner loop is implemented >> completely naively, but it still beats calls to BLAS (even linked with >> ATLAS) for small matrix dimensions (my case was also up to ~ 15x15). >> > > Another idea: If the matrices are more in the intermediate range, here's > a Cython library for calling BLAS more directly: > > http://www.vetta.org/2009/09/tokyo-a-cython-blas-wrapper-for-fast-matrix-math/ > I actually tried to use tokyo, but I couldn't get it to build against the ATLAS I compiled a few days ago out of the box. A few changes to setup.py didn't fix it, so I gave up. > For intermediate-size matrices the use of SSE instructions should be > able to offset any call overhead. Try to stay clear of using NumPy for > slicing though, instead one should do pointer arithmetic... Right. Thanks. Skipper From jsseabold at gmail.com Tue Dec 7 11:47:21 2010 From: jsseabold at gmail.com (Skipper Seabold) Date: Tue, 7 Dec 2010 11:47:21 -0500 Subject: [SciPy-User] fast small matrix multiplication with cython? In-Reply-To: References: Message-ID: On Tue, Dec 7, 2010 at 9:54 AM, Charles R Harris wrote: > > > On Mon, Dec 6, 2010 at 11:56 PM, Fernando Perez > wrote: >> >> Hi Skipper, >> >> On Mon, Dec 6, 2010 at 2:34 PM, Skipper Seabold >> wrote: >> > I'm wondering if anyone might have a look at my cython code that does >> > matrix multiplication and see where I can speed it up or offer some >> > pointers/reading. ?I'm new to Cython and my knowledge of C is pretty >> > basic based on trial and (mostly) error, so I am sure the code is >> > still very naive. >> >> a few years ago I had a similar problem, and I ended up getting a very >> significant speedup by hand-coding a very unsafe, but very fast pure C >> extension just to compute these inner products. ?This was basically a >> replacement for dot() that would only work with double precision >> inputs of compatible dimensions and would happily segfault with >> anything else, but it ran very fast. ?The inner loop is implemented >> completely naively, but it still beats calls to BLAS (even linked with >> ATLAS) for small matrix dimensions (my case was also up to ~ 15x15). 
>> >> I'm attaching the code in case you find it useful, please keep in mind >> I haven't compiled it in years, so it may have bit-rotted a little. >> > > Blas adds quite a bit of overhead for multiplying small matrices, but so > does calling from python. For implementing Kalman filters it might be better > to write a whole Kalman class so that operations can be combined at the c > level. > > Skipper, what kind of Kalman filter are you trying to implement? > It's just a linear Gaussian filter. I use it to get the loglikelihood of a univariate ARMA series with exact initial conditions. As it stands it is fairly inflexible, but if I can make it fast I would like to generalize it. There is a fair amount of scratch work in here, and some attempts at generalized state space models, but all the action for my purposes is in KalmanFilter.loglike http://bazaar.launchpad.net/~jsseabold/statsmodels/statsmodels-skipper/annotate/head%3A/scikits/statsmodels/tsa/kalmanf/kalmanf.py#L505 It's not terribly slow, but I have to maximize the likelihood using numerical derivatives, so it's getting called quite a few times. A 1000 observation ARMA(2,2) series takes about 5-6 seconds on my machine with fmin_l_bfgs_b. Skipper From dagss at student.matnat.uio.no Tue Dec 7 11:52:43 2010 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Tue, 07 Dec 2010 17:52:43 +0100 Subject: [SciPy-User] fast small matrix multiplication with cython? In-Reply-To: References: Message-ID: <4CFE665B.70202@student.matnat.uio.no> On 12/07/2010 04:35 PM, Keith Goodman wrote: > On Mon, Dec 6, 2010 at 2:34 PM, Skipper Seabold wrote: > >> I'm wondering if anyone might have a look at my cython code that does >> matrix multiplication and see where I can speed it up or offer some >> pointers/reading. I'm new to Cython and my knowledge of C is pretty >> basic based on trial and (mostly) error, so I am sure the code is >> still very naive. >> >> >> >> cdef ndarray[DOUBLE, ndim=2] out = PyArray_SimpleNew(2, dims, NPY_DOUBLE) >> > I'd like to reduce the overhead in creating the empty array. Using > PyArray_SimpleNew in Cython is faster than using np.empty but both are > slower than using np.empty without Cython. Have I done something > wrong? I suspect is has something to do with this line in the code > below: "cdef npy_intp *dims = [r, c]" > Nope, unless something very strange is going on, that line would be ridiculously fast compared to the rest. Basically just copying two integers on the stack. Try PyArray_EMPTY? Dag Sverre > PyArray_SimpleNew: > >>> timeit matmult(2,2) >>> > 1000000 loops, best of 3: 773 ns per loop > > np.empty in cython: > >>> timeit matmult2(2,2) >>> > 1000000 loops, best of 3: 1.62 us per loop > > np.empty in python: > >>> timeit np.empty((2,2)) >>> > 1000000 loops, best of 3: 465 ns per loop > > Code: > > import numpy as np > from numpy cimport float64_t, ndarray, NPY_DOUBLE, npy_intp > > ctypedef float64_t DOUBLE > > cdef extern from "numpy/arrayobject.h": > cdef void import_array() > cdef object PyArray_SimpleNew(int nd, npy_intp *dims, int typenum) > > # initialize numpy > import_array() > > def matmult(int r, int c): > cdef npy_intp *dims = [r, c] # Is there a faster way to do this? 
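>     # PyArray_SimpleNew on the next line allocates an uninitialized
>     # r-by-c float64 array directly through the NumPy C API declared above.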
> cdef ndarray[DOUBLE, ndim=2] out = PyArray_SimpleNew(2, dims, NPY_DOUBLE) > return out > > def matmult2(int r, int c): > cdef ndarray[DOUBLE, ndim=2] out = np.empty((r, c), dtype=np.float64) > return out > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From charlesr.harris at gmail.com Tue Dec 7 11:53:23 2010 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 7 Dec 2010 09:53:23 -0700 Subject: [SciPy-User] fast small matrix multiplication with cython? In-Reply-To: References: Message-ID: On Tue, Dec 7, 2010 at 9:47 AM, Skipper Seabold wrote: > On Tue, Dec 7, 2010 at 9:54 AM, Charles R Harris > wrote: > > > > > > On Mon, Dec 6, 2010 at 11:56 PM, Fernando Perez > > wrote: > >> > >> Hi Skipper, > >> > >> On Mon, Dec 6, 2010 at 2:34 PM, Skipper Seabold > >> wrote: > >> > I'm wondering if anyone might have a look at my cython code that does > >> > matrix multiplication and see where I can speed it up or offer some > >> > pointers/reading. I'm new to Cython and my knowledge of C is pretty > >> > basic based on trial and (mostly) error, so I am sure the code is > >> > still very naive. > >> > >> a few years ago I had a similar problem, and I ended up getting a very > >> significant speedup by hand-coding a very unsafe, but very fast pure C > >> extension just to compute these inner products. This was basically a > >> replacement for dot() that would only work with double precision > >> inputs of compatible dimensions and would happily segfault with > >> anything else, but it ran very fast. The inner loop is implemented > >> completely naively, but it still beats calls to BLAS (even linked with > >> ATLAS) for small matrix dimensions (my case was also up to ~ 15x15). > >> > >> I'm attaching the code in case you find it useful, please keep in mind > >> I haven't compiled it in years, so it may have bit-rotted a little. > >> > > > > Blas adds quite a bit of overhead for multiplying small matrices, but so > > does calling from python. For implementing Kalman filters it might be > better > > to write a whole Kalman class so that operations can be combined at the c > > level. > > > > Skipper, what kind of Kalman filter are you trying to implement? > > > > It's just a linear Gaussian filter. I use it to get the loglikelihood > of a univariate ARMA series with exact initial conditions. As it > stands it is fairly inflexible, but if I can make it fast I would like > to generalize it. > > There is a fair amount of scratch work in here, and some attempts at > generalized state space models, but all the action for my purposes is > in KalmanFilter.loglike > > > http://bazaar.launchpad.net/~jsseabold/statsmodels/statsmodels-skipper/annotate/head%3A/scikits/statsmodels/tsa/kalmanf/kalmanf.py#L505 > > It's not terribly slow, but I have to maximize the likelihood using > numerical derivatives, so it's getting called quite a few times. A > 1000 observation ARMA(2,2) series takes about 5-6 seconds on my > machine with fmin_l_bfgs_b. > > Just a guess here, but the numerical derivative bit makes it sounds like you are implementing a generalized Kalman filter. Have you looked at unscented Kalman filters? Chuck -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From josef.pktd at gmail.com Tue Dec 7 12:05:21 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Tue, 7 Dec 2010 12:05:21 -0500 Subject: [SciPy-User] fast small matrix multiplication with cython? In-Reply-To: References: Message-ID: On Tue, Dec 7, 2010 at 11:53 AM, Charles R Harris wrote: > > > On Tue, Dec 7, 2010 at 9:47 AM, Skipper Seabold wrote: >> >> On Tue, Dec 7, 2010 at 9:54 AM, Charles R Harris >> wrote: >> > >> > >> > On Mon, Dec 6, 2010 at 11:56 PM, Fernando Perez >> > wrote: >> >> >> >> Hi Skipper, >> >> >> >> On Mon, Dec 6, 2010 at 2:34 PM, Skipper Seabold >> >> wrote: >> >> > I'm wondering if anyone might have a look at my cython code that does >> >> > matrix multiplication and see where I can speed it up or offer some >> >> > pointers/reading. ?I'm new to Cython and my knowledge of C is pretty >> >> > basic based on trial and (mostly) error, so I am sure the code is >> >> > still very naive. >> >> >> >> a few years ago I had a similar problem, and I ended up getting a very >> >> significant speedup by hand-coding a very unsafe, but very fast pure C >> >> extension just to compute these inner products. ?This was basically a >> >> replacement for dot() that would only work with double precision >> >> inputs of compatible dimensions and would happily segfault with >> >> anything else, but it ran very fast. ?The inner loop is implemented >> >> completely naively, but it still beats calls to BLAS (even linked with >> >> ATLAS) for small matrix dimensions (my case was also up to ~ 15x15). >> >> >> >> I'm attaching the code in case you find it useful, please keep in mind >> >> I haven't compiled it in years, so it may have bit-rotted a little. >> >> >> > >> > Blas adds quite a bit of overhead for multiplying small matrices, but so >> > does calling from python. For implementing Kalman filters it might be >> > better >> > to write a whole Kalman class so that operations can be combined at the >> > c >> > level. >> > >> > Skipper, what kind of Kalman filter are you trying to implement? >> > >> >> It's just a linear Gaussian filter. ?I use it to get the loglikelihood >> of a univariate ARMA series with exact initial conditions. ?As it >> stands it is fairly inflexible, but if I can make it fast I would like >> to generalize it. >> >> There is a fair amount of scratch work in here, and some attempts at >> generalized state space models, but all the action for my purposes is >> in KalmanFilter.loglike >> >> >> http://bazaar.launchpad.net/~jsseabold/statsmodels/statsmodels-skipper/annotate/head%3A/scikits/statsmodels/tsa/kalmanf/kalmanf.py#L505 >> >> It's not terribly slow, but I have to maximize the likelihood using >> numerical derivatives, so it's getting called quite a few times. ?A >> 1000 observation ARMA(2,2) series takes about 5-6 seconds on my >> machine with fmin_l_bfgs_b. >> > > Just a guess here, but the numerical derivative bit makes it sounds like you > are implementing a generalized Kalman filter. Have you looked at unscented > Kalman filters? It's still a linear filter, non-linear optimization comes in because the exact loglikelihood function for ARMA is non-linear in the coefficients. (There might be a way to calculate the derivative in the same loop, but that's a different issue.) 
Josef > > Chuck > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > From kwgoodman at gmail.com Tue Dec 7 12:08:51 2010 From: kwgoodman at gmail.com (Keith Goodman) Date: Tue, 7 Dec 2010 09:08:51 -0800 Subject: [SciPy-User] fast small matrix multiplication with cython? In-Reply-To: <4CFE665B.70202@student.matnat.uio.no> References: <4CFE665B.70202@student.matnat.uio.no> Message-ID: On Tue, Dec 7, 2010 at 8:52 AM, Dag Sverre Seljebotn wrote: > On 12/07/2010 04:35 PM, Keith Goodman wrote: >> On Mon, Dec 6, 2010 at 2:34 PM, Skipper Seabold ?wrote: >> >>> I'm wondering if anyone might have a look at my cython code that does >>> matrix multiplication and see where I can speed it up or offer some >>> pointers/reading. ?I'm new to Cython and my knowledge of C is pretty >>> basic based on trial and (mostly) error, so I am sure the code is >>> still very naive. >>> >>> >>> >>> ? ? cdef ndarray[DOUBLE, ndim=2] out = PyArray_SimpleNew(2, dims, NPY_DOUBLE) >>> >> I'd like to reduce the overhead in creating the empty array. Using >> PyArray_SimpleNew in Cython is faster than using np.empty but both are >> slower than using np.empty without Cython. Have I done something >> wrong? I suspect is has something to do with this line in the code >> below: "cdef npy_intp *dims = [r, c]" >> > > Nope, unless something very strange is going on, that line would be > ridiculously fast compared to the rest. Basically just copying two > integers on the stack. > > Try PyArray_EMPTY? PyArray_EMPTY is a little faster (but np.empty is still much faster): PyArray_SimpleNew >> timeit matmult(2,2) 1000000 loops, best of 3: 778 ns per loop PyArray_EMPTY >> timeit matmult3(2,2) 1000000 loops, best of 3: 763 ns per loop np.empty in python >> timeit np.empty((2,2)) 1000000 loops, best of 3: 470 ns per loop def matmult3(int r, int c): cdef npy_intp *dims = [r, c] cdef ndarray[DOUBLE, ndim=2] out = PyArray_EMPTY(2, dims, NPY_FLOAT64, 0) return out From jsseabold at gmail.com Tue Dec 7 12:15:46 2010 From: jsseabold at gmail.com (Skipper Seabold) Date: Tue, 7 Dec 2010 12:15:46 -0500 Subject: [SciPy-User] fast small matrix multiplication with cython? In-Reply-To: References: Message-ID: On Tue, Dec 7, 2010 at 12:05 PM, wrote: > On Tue, Dec 7, 2010 at 11:53 AM, Charles R Harris > wrote: >> >> >> On Tue, Dec 7, 2010 at 9:47 AM, Skipper Seabold wrote: >>> >>> On Tue, Dec 7, 2010 at 9:54 AM, Charles R Harris >>> wrote: >>> > >>> > >>> > On Mon, Dec 6, 2010 at 11:56 PM, Fernando Perez >>> > wrote: >>> >> >>> >> Hi Skipper, >>> >> >>> >> On Mon, Dec 6, 2010 at 2:34 PM, Skipper Seabold >>> >> wrote: >>> >> > I'm wondering if anyone might have a look at my cython code that does >>> >> > matrix multiplication and see where I can speed it up or offer some >>> >> > pointers/reading. ?I'm new to Cython and my knowledge of C is pretty >>> >> > basic based on trial and (mostly) error, so I am sure the code is >>> >> > still very naive. >>> >> >>> >> a few years ago I had a similar problem, and I ended up getting a very >>> >> significant speedup by hand-coding a very unsafe, but very fast pure C >>> >> extension just to compute these inner products. ?This was basically a >>> >> replacement for dot() that would only work with double precision >>> >> inputs of compatible dimensions and would happily segfault with >>> >> anything else, but it ran very fast. 
?The inner loop is implemented >>> >> completely naively, but it still beats calls to BLAS (even linked with >>> >> ATLAS) for small matrix dimensions (my case was also up to ~ 15x15). >>> >> >>> >> I'm attaching the code in case you find it useful, please keep in mind >>> >> I haven't compiled it in years, so it may have bit-rotted a little. >>> >> >>> > >>> > Blas adds quite a bit of overhead for multiplying small matrices, but so >>> > does calling from python. For implementing Kalman filters it might be >>> > better >>> > to write a whole Kalman class so that operations can be combined at the >>> > c >>> > level. >>> > >>> > Skipper, what kind of Kalman filter are you trying to implement? >>> > >>> >>> It's just a linear Gaussian filter. ?I use it to get the loglikelihood >>> of a univariate ARMA series with exact initial conditions. ?As it >>> stands it is fairly inflexible, but if I can make it fast I would like >>> to generalize it. >>> >>> There is a fair amount of scratch work in here, and some attempts at >>> generalized state space models, but all the action for my purposes is >>> in KalmanFilter.loglike >>> >>> >>> http://bazaar.launchpad.net/~jsseabold/statsmodels/statsmodels-skipper/annotate/head%3A/scikits/statsmodels/tsa/kalmanf/kalmanf.py#L505 >>> >>> It's not terribly slow, but I have to maximize the likelihood using >>> numerical derivatives, so it's getting called quite a few times. ?A >>> 1000 observation ARMA(2,2) series takes about 5-6 seconds on my >>> machine with fmin_l_bfgs_b. >>> >> >> Just a guess here, but the numerical derivative bit makes it sounds like you >> are implementing a generalized Kalman filter. Have you looked at unscented >> Kalman filters? > > It's still a linear filter, non-linear optimization comes in because > the exact loglikelihood function for ARMA is non-linear in the > coefficients. > (There might be a way to calculate the derivative in the same loop, > but that's a different issue.) > Right. The derivative is of the whole likelihood function with respect to the parameters that make up the system matrices in the state equation. You can calculate the derivatives as part of the recursions but the literature I've seen suggests that numerical derivatives of the state equation matrices are the way to go. Skipper From charlesr.harris at gmail.com Tue Dec 7 12:17:05 2010 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 7 Dec 2010 10:17:05 -0700 Subject: [SciPy-User] fast small matrix multiplication with cython? In-Reply-To: References: Message-ID: On Tue, Dec 7, 2010 at 10:05 AM, wrote: > On Tue, Dec 7, 2010 at 11:53 AM, Charles R Harris > wrote: > > > > > > On Tue, Dec 7, 2010 at 9:47 AM, Skipper Seabold > wrote: > >> > >> On Tue, Dec 7, 2010 at 9:54 AM, Charles R Harris > >> wrote: > >> > > >> > > >> > On Mon, Dec 6, 2010 at 11:56 PM, Fernando Perez > > >> > wrote: > >> >> > >> >> Hi Skipper, > >> >> > >> >> On Mon, Dec 6, 2010 at 2:34 PM, Skipper Seabold > > >> >> wrote: > >> >> > I'm wondering if anyone might have a look at my cython code that > does > >> >> > matrix multiplication and see where I can speed it up or offer some > >> >> > pointers/reading. I'm new to Cython and my knowledge of C is > pretty > >> >> > basic based on trial and (mostly) error, so I am sure the code is > >> >> > still very naive. 
From charlesr.harris at gmail.com Tue Dec 7 12:17:05 2010
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Tue, 7 Dec 2010 10:17:05 -0700
Subject: [SciPy-User] fast small matrix multiplication with cython?
In-Reply-To: References: Message-ID:

On Tue, Dec 7, 2010 at 10:05 AM, wrote:

> It's still a linear filter; non-linear optimization comes in because
> the exact loglikelihood function for ARMA is non-linear in the
> coefficients.
> (There might be a way to calculate the derivative in the same loop,
> but that's a different issue.)
>

The unscented Kalman filter is a better way to estimate the covariance
of a non-linear process; think of it as a better integrator. If the
propagation is easy to compute, which seems to be the case here, it will
probably save you some time. You might even be able to use the basic idea
and skip the Kalman part altogether.

My general aim here is to optimize the algorithm first before getting
caught up in the details of matrix multiplication in C. Premature
optimization and all that.

Chuck
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
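To make the object under discussion concrete, here is a rough, generic NumPy sketch of the linear Gaussian Kalman recursion for the loglikelihood. This is a textbook prediction-form filter under an assumed state-space model, not the kalmanf.py code linked above, and all names are illustrative:

import numpy as np

def kalman_loglike(y, T, Z, R, Q, H, a, P):
    # Assumed model (a common textbook form):
    #   a[t+1] = T a[t] + R eta[t],  eta ~ N(0, Q)
    #   y[t]   = Z a[t] + eps[t],    eps ~ N(0, H)
    # y: (nobs, k) array of observation vectors; a, P: initial state
    # mean and covariance.
    a, P = a.copy(), P.copy()
    loglike = 0.0
    for yt in y:
        v = yt - np.dot(Z, a)                      # innovation
        F = np.dot(Z, np.dot(P, Z.T)) + H          # innovation covariance
        Fi = np.linalg.inv(F)
        loglike -= 0.5 * (len(v) * np.log(2 * np.pi)
                          + np.log(np.linalg.det(F))
                          + np.dot(v, np.dot(Fi, v)))
        K = np.dot(T, np.dot(P, np.dot(Z.T, Fi)))  # Kalman gain
        a = np.dot(T, a) + np.dot(K, v)            # one-step prediction
        P = (np.dot(T, np.dot(P, T.T))
             - np.dot(K, np.dot(Z, np.dot(P, T.T)))
             + np.dot(R, np.dot(Q, R.T)))
    return loglike

The per-step inversion of F is where the small-matrix costs concentrate, which is exactly the overhead this thread is trying to reduce.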
From jsseabold at gmail.com Tue Dec 7 12:39:28 2010
From: jsseabold at gmail.com (Skipper Seabold)
Date: Tue, 7 Dec 2010 12:39:28 -0500
Subject: [SciPy-User] fast small matrix multiplication with cython?
In-Reply-To: References: Message-ID:

On Tue, Dec 7, 2010 at 12:17 PM, Charles R Harris wrote:
>
> The unscented Kalman filter is a better way to estimate the covariance
> of a non-linear process; think of it as a better integrator. If the
> propagation is easy to compute, which seems to be the case here, it will
> probably save you some time. You might even be able to use the basic idea
> and skip the Kalman part altogether.
>
> My general aim here is to optimize the algorithm first before getting
> caught up in the details of matrix multiplication in C. Premature
> optimization and all that.
>

Hmm, I haven't seen this mentioned much in what I've been reading or in
the documentation on existing software for ARMA processes, so I never
thought much about it. I will have a closer look. Well, google turns
up this thread...

There is another optimization that I could employ by switching to fast
recursions when the state variance converges to its steady state, but
this makes it less general for future enhancements (i.e., time varying
coefficients). Maybe I will go ahead and try it.

Skipper

From seb.haase at gmail.com Tue Dec 7 13:00:11 2010
From: seb.haase at gmail.com (Sebastian Haase)
Date: Tue, 7 Dec 2010 19:00:11 +0100
Subject: [SciPy-User] ancova with optimize.curve_fit
In-Reply-To: References: <2EDB4EB07FAA4F78B5ED0C2801524D5F@gmail.com> <2061813969B94BD39B4BFFDABFFC2D00@gmail.com> <80C1F8072EB74C4F903BC96C09FB1D6A@gmail.com> <4CFEAE11.7060408@gmail.com>
Message-ID:

On Tue, Dec 7, 2010 at 5:35 PM, wrote:
> On Tue, Dec 7, 2010 at 4:58 PM, Bruce Southey wrote:
>> On 12/06/2010 10:00 PM, Peter Tittmann wrote:
>>
>> Gentlemen,
>> I've decided to switch to the OLS method, though I did get the NNLS method
>> that Skipper proposed working. I was not prepared to spend more time trying
>> to make sense of the resulting array for ancova, etc. (also could not figure
>> out how to interpret the resulting coefficient array, as I was expecting a 2d
>> array representing the a and b coefficient values but it returned a 1d
>> array). I have hopefully simple follow up questions:
>> 1. Is there a method to define explicitly the function used in OLS? I know
>> numpy.linalg.lstsq is the way OLS works, but is there another function where
>> I can define the form?
>> 2. I'm still interested in interpreting the results of the NNLS method, so
>> if either of you can suggest what the resulting arrays mean I'd be grateful.
>> I've attached the output of NNLS.
>> warm regards,
>> Peter
>>
>> Here is the working version of NNLS:
>>
>> def getDiam2(ht, *b):
>>     return b[0] * ht[:,1]**b[1] + np.sum(b[2:]*ht[:,2:], axis=1)
>>
>> dt = np.genfromtxt('/home/peter/Desktop/db_out.csv', delimiter=",",
>>                    names=True, dtype=None)
>>
>> indHtPlot = adt['height']
>> depDbh = adt['dbh']
>> plot_dummies, col_map = sm.tools.categorical(dt['plot'], drop=True,
>>                                              dictnames=True)
>>
>> def nnlsDummies():
>>     '''this function returns coefficients and covariance arrays'''
>>     plot_dummies, col_map = sm.tools.categorical(indPlot, drop=True,
>>                                                  dictnames=True)
>>     X = np.column_stack((indHt, plot_dummies))
>>     Y = depDbh
>>     coefs, cov = curve_fit(getDiam2, X, Y, p0=[0.]*X.shape[1])
>>     return coefs, cov
>>
>> --
>> Peter Tittmann
>>
>> On Monday, December 6, 2010 at 4:55 PM, josef.pktd at gmail.com wrote:
>>
>> On Mon, Dec 6, 2010 at 7:41 PM, Skipper Seabold wrote:
>>
>> On Mon, Dec 6, 2010 at 7:31 PM, Peter Tittmann wrote:
>>> thanks both of you,
>>> Josef, the data that I sent is only the first 100 rows of about 1500; there
>>> should be sufficient sampling in each plot.
>>> Skipper, I have attempted to deploy your suggestion for not linearizing the
>>> data. It seems to work. I'm a little confused by your modification of the
>>> getDiam function and I wonder if you could help me understand. The form of
>>> the equation that is being fit is:
>>> Y = a*X^b
>>> your version of the getDiam function:
>>>
>>> def getDiam(ht, *b):
>>>     return ht[:,0]**b[0] + np.sum(b[1:]*ht[:,1:], axis=1)
>>>
>>> I'm sorry if this is an obvious question, but I don't understand how this
>>> works, as it seems that the "a" coefficient is missing.
>>> Thanks again!
>>
>> Right. I took out the 'a', because as I read it when I linearized (I
>> might be misunderstanding ancova, I never recall the details), if you
>> include 'a' and also all of the dummy variables for the plot, then you
>> will have the problem of multicollinearity. You could also include
>> 'a' and drop one of the plot dummies, but then 'a' is just your
>> reference category that you dropped. So now b[0] is the nonlinear
>> effect of your main variable and b[1:] contains linear shift effects
>> of all the plots. Hmm, thinking about it some more, though, I think
>> you could include 'a' in the non-linear version above (call it b[0]
>> and shift everything else over by one), because now 'a' would be the
>> effect when the current b[0] is zero. I was just unsure how you meant
>> 'a' when you had a*ht**b and were trying to include in ht the plot
>> variable dummies.
>>
>> As I understand it, the intention is to estimate equality of the slope
>> coefficients, so the continuous variable is multiplied with the dummy
>> variables. In this case, the constant should still be added. The
>> normalization question is whether to include all dummy-cont.variable
>> products and drop the continuous variable, or include the continuous
>> variable and drop one of the dummy-cont levels.
>>
>> Unless there is a strong reason to avoid log-normality of errors, I
>> would work (first) with the linear version.
>>
>> Josef
>>
>> Skipper
>> _______________________________________________
>> SciPy-User mailing list
>> SciPy-User at scipy.org
>> http://mail.scipy.org/mailman/listinfo/scipy-user
>>
>> I do think this is starting to be an off-list discussion, because this is
>> really about statistics and not about numpy/scipy (you can contact me
>> off-list if you want).
>
> If it's too much stats, we can continue on the statsmodels list. Last
> time there was a similar question, I programmed most of the tests for
> the linear case, and it should soon be possible to do it for the
> non-linear case also.
> The target is to be able to do this in 5 (or so) lines of code for the
> linear case and maybe 10 lines for the non-linear case.
>
> Josef
>
>> I am not sure what all the variables are, so please excuse me, but I presume
>> you want to model dbh as a function of height, plot and species. Following
>> the usual biostatistics interpretation, 'plot' is probably treated as a random
>> effect, but you probably have to use R/SAS etc. for that, for both linear and
>> nonlinear models or some spatial models.
>>
>> Really you need to determine whether or not a nonlinear model is required.
>> With the few data points you provided, I only see a linear relationship
>> between dbh and height, with some outliers and perhaps some heterogeneity of
>> variance. Often doing a simple polynomial/spline can help to see if there is
>> any evidence for a nonlinear relationship in the full data - a linear model
>> or polynomial with the data provided does not suggest a nonlinear model.
>> Obviously a linear model is easier to fit and interpret, especially if you
>> create the design matrix as estimable functions (which is rather trivial
>> once you understand using dummy variables).
>>
>> The most general nonlinear/multilevel model proposed is of the form:
>> dbh = C + A*height^B
>> Obviously if B=1 then it is a linear model, and the parameters A, B and C can
>> be modeled with a linear function of intercept, plot and species. Although,
>> if 'plot' is what I think it is, then you probably would not model the
>> parameters A and B with it.
>>
>> Without C you are forcing the curve through zero, which is biologically
>> feasible if you expect dbh=0 when height is zero. However, dbh can be zero
>> when height is not zero, just due to the model itself or due to what dbh
>> actually is (it may take a minimum height before dbh is greater than zero).
>> With the data you provided, there are noticeable differences between species
>> for dbh and height, so you probably need C in your model.
>>
>> For this general model you probably should just fit the curve for each
>> species alone, but I would use a general stats package to do this. This will
>> give you a good starting point to know how well the curve fits each species
>> as well as the similarity of parameters and residual variation. Getting
>> convergence with a model that has B varying across species may be rather
>> hard, so I would suggest modeling A and C first.
>>
>> Bruce
>>

I have never heard of a "statsmodels list" -- where and what is that!?
Thanks,
- Sebastian Haase

From jsseabold at gmail.com Tue Dec 7 13:06:21 2010
From: jsseabold at gmail.com (Skipper Seabold)
Date: Tue, 7 Dec 2010 13:06:21 -0500
Subject: [SciPy-User] ancova with optimize.curve_fit
In-Reply-To: References: <2EDB4EB07FAA4F78B5ED0C2801524D5F@gmail.com> <2061813969B94BD39B4BFFDABFFC2D00@gmail.com> <80C1F8072EB74C4F903BC96C09FB1D6A@gmail.com> <4CFEAE11.7060408@gmail.com>
Message-ID:

On Tue, Dec 7, 2010 at 1:00 PM, Sebastian Haase wrote:
> I have never heard of a "statsmodels list" -- where and what is that!?
> Thanks,
> - Sebastian Haase

http://groups.google.com/group/pystatsmodels/

Mostly discussion on the statsmodels scikit and stats with Python,
including development chatter, so the signal to noise ratio might be
low...

Skipper

From xunchen.liu at gmail.com Tue Dec 7 14:25:25 2010
From: xunchen.liu at gmail.com (Xunchen Liu)
Date: Tue, 7 Dec 2010 12:25:25 -0700
Subject: [SciPy-User] least square fit with weight in linalg.lstsq
Message-ID:

Hello,

Is there an option to have weighting in a least square fit? Attached is
the code where I use linalg.lstsq to fit 4 parameters from 6 measurements.
Also, how can I read the uncertainty of each fitted parameter?

Thanks a lot!

Xunchen

==============
freq = np.transpose(np.array([(3609.283, 3611.204, 3611.897,
                               3619.310, 3620.270, 3620.98)]))
JK = np.array([(1, -2, 0,   4),
               (1,  2, 0,  -4),
               (1,  4, 0, -32),
               (1,  0, 1,   0),
               (1,  2, 1,  -4),
               (1,  4, 1, -32)])
c, resid, rank, sigma = linalg.lstsq(JK, freq)
==============
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From charlesr.harris at gmail.com Tue Dec 7 14:33:36 2010
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Tue, 7 Dec 2010 12:33:36 -0700
Subject: [SciPy-User] least square fit with weight in linalg.lstsq
In-Reply-To: References: Message-ID:

On Tue, Dec 7, 2010 at 12:25 PM, Xunchen Liu wrote:
> Hello,
>
> Is there an option to have weighting in a least square fit? Attached is
> the code where I use linalg.lstsq to fit 4 parameters from 6 measurements.
> Also, how can I read the uncertainty of each fitted parameter?
>

There is no option for weights at the moment, nor for the covariance of
the fitted parameters. We should fix that up someday, but at the moment
you need to write your own implementation.

Chuck
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
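A sketch of the kind of hand-rolled implementation Chuck suggests: weighted least squares by row rescaling, with a covariance estimate for the parameter uncertainties Xunchen asked about. The function name is invented, and treating w as inverse variances of the measurements is an assumption:

import numpy as np

def wlstsq(A, y, w):
    # Weighted least squares by rescaling rows: minimizes
    # ||W^(1/2) (A x - y)||^2 with W = diag(w), w ~ 1/variance (assumed).
    sw = np.sqrt(w)
    Aw = A * sw[:, np.newaxis]
    yw = y * sw
    x, resid, rank, sv = np.linalg.lstsq(Aw, yw)
    # Covariance of the fitted parameters: sigma2 * (A' W A)^-1, with
    # sigma2 estimated from the weighted residuals.
    dof = A.shape[0] - A.shape[1]
    sigma2 = np.sum((yw - Aw.dot(x))**2) / dof
    cov = sigma2 * np.linalg.inv(Aw.T.dot(Aw))
    return x, cov

# For the JK/freq example above, parameter uncertainties would be the
# square roots of the diagonal:
#   x, cov = wlstsq(JK, freq.ravel(), np.ones(6))
#   err = np.sqrt(np.diag(cov))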
From laserson at mit.edu Tue Dec 7 15:06:25 2010
From: laserson at mit.edu (Uri Laserson)
Date: Tue, 7 Dec 2010 15:06:25 -0500
Subject: [SciPy-User] Problems installing scipy on OS X 10.6 (Snow Leopard): libarpack
Message-ID:

Hi all,

I am on a MacMini with an Intel processor. I just installed OS X 10.6 and
the latest Xcode that I could download, which included gcc 4.2. I am using
python 2.7 built from source using homebrew. I installed the gfortran 4.2.3
binaries from http://r.research.att.com/tools/.

I am trying to install numpy and scipy. numpy installs fine with or without
switching to g++-4.0. I have successfully installed it using pip and also
directly from source from the git repository.

Scipy is giving me errors on install (the same errors whether I use pip or
try the svn repository). I installed it successfully yesterday on a new
Macbook Air using pip, after changing the symlinks to point to g++-4.0.
However, today on my MacMini, I am getting errors after following the same
protocol.

The errors I am getting are here: https://gist.github.com/732293

I also don't know why it's referencing temp.macosx-10.4-x86_64-2.7 while I
am on 10.6. Please help!

Thanks!
Uri

...................................................................................
Uri Laserson
Graduate Student, Biomedical Engineering
Harvard-MIT Division of Health Sciences and Technology
M +1 917 742 8019
laserson at mit.edu
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From josef.pktd at gmail.com Tue Dec 7 15:51:52 2010
From: josef.pktd at gmail.com (josef.pktd at gmail.com)
Date: Tue, 7 Dec 2010 15:51:52 -0500
Subject: [SciPy-User] least square fit with weight in linalg.lstsq
In-Reply-To: References: Message-ID:

On Tue, Dec 7, 2010 at 2:33 PM, Charles R Harris wrote:
> There is no option for weights at the moment, nor for the covariance of
> the fitted parameters. We should fix that up someday, but at the moment
> you need to write your own implementation.
>

In writing your own implementation, you could use optimize.curve_fit as
a pattern for the weight handling. Or you can just use
scikits.statsmodels.WLS

http://pypi.python.org/pypi/scikits.statsmodels

Josef

From gus.is.here at gmail.com Tue Dec 7 16:26:59 2010
From: gus.is.here at gmail.com (Gus Ishere)
Date: Tue, 7 Dec 2010 16:26:59 -0500
Subject: [SciPy-User] Speeding up integrate.odeint with weave/blitz
Message-ID:

I currently have a call like:

    integrate.odeint(dX_dt, X0, t, full_output=True)

One approach to speed up the integration time is to make dX_dt into a weave
function. Is there a better way to speed it up and avoid the function call
overhead in Python? Any examples would be very much appreciated.
dX_dt is very simple.

Thanks,
Gustavo
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From rob.clewley at gmail.com Tue Dec 7 17:48:52 2010
From: rob.clewley at gmail.com (Rob Clewley)
Date: Tue, 7 Dec 2010 17:48:52 -0500
Subject: [SciPy-User] Speeding up integrate.odeint with weave/blitz
In-Reply-To: References: Message-ID:

Gus,

On Tue, Dec 7, 2010 at 4:26 PM, Gus Ishere wrote:
> Is there a better way to speed it up and avoid the function call
> overhead in Python?

There are different compromises in the different possible approaches; I
wouldn't say any is "best". One approach is to use my PyDSTool package
(pydstool.sourceforge.net), which I think will save you work and provide
powerful options in the long term, but requires some extra installation
and a learning curve for the more sophisticated user interface. It will
automatically compile the dX_dt function you specify into C and link it
with the C integrator. So there is very little overhead in interaction
between your function and the integrator, and once my code is installed
this happens quite transparently to the user. PyDSTool even gives you the
opportunity to hand-optimize the C code for your function before final
compilation, if that floats your speedboat.

--
Robert Clewley, Ph.D.
Assistant Professor
Neuroscience Institute and
Department of Mathematics and Statistics
Georgia State University
PO Box 5030
Atlanta, GA 30302, USA

tel: 404-413-6420 fax: 404-413-5446
http://www2.gsu.edu/~matrhc
http://neuroscience.gsu.edu/rclewley.html

From charlesr.harris at gmail.com Tue Dec 7 18:45:22 2010
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Tue, 7 Dec 2010 16:45:22 -0700
Subject: [SciPy-User] fast small matrix multiplication with cython?
In-Reply-To: References: Message-ID:

On Tue, Dec 7, 2010 at 10:39 AM, Skipper Seabold wrote:

> Hmm, I haven't seen this mentioned much in what I've been reading or in
> the documentation on existing software for ARMA processes, so I never
> thought much about it. I will have a closer look. Well, google turns
> up this thread...
>

I've started reading up a bit on what you are doing, and the application
doesn't use extended Kalman filters, so the suggestion to use unscented
Kalman filters is irrelevant. Sorry about that ;) I'm still wading through
the various statistical notation thickets to see if there might be a
better form to use for the problem, but I don't see one at the moment.

Chuck
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
From josef.pktd at gmail.com Tue Dec 7 19:09:12 2010
From: josef.pktd at gmail.com (josef.pktd at gmail.com)
Date: Tue, 7 Dec 2010 19:09:12 -0500
Subject: [SciPy-User] fast small matrix multiplication with cython?
In-Reply-To: References: Message-ID:

On Tue, Dec 7, 2010 at 6:45 PM, Charles R Harris wrote:
>
> I've started reading up a bit on what you are doing, and the application
> doesn't use extended Kalman filters, so the suggestion to use unscented
> Kalman filters is irrelevant. Sorry about that ;) I'm still wading through
> the various statistical notation thickets to see if there might be a
> better form to use for the problem, but I don't see one at the moment.
>

There are faster ways to get the likelihood for a simple ARMA process
than using a Kalman filter. I think the main advantage, and the reason
for the popularity of the Kalman filter for this, is that it is easier to
extend. So using too many tricks that are specific to the simple ARMA
might take away much of the advantage of getting a fast Kalman filter.

I didn't read much of the details of the Kalman filter for this, but
that was my conclusion from the non-Kalman-filter literature.

Josef

From charlesr.harris at gmail.com Tue Dec 7 19:37:31 2010
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Tue, 7 Dec 2010 17:37:31 -0700
Subject: [SciPy-User] fast small matrix multiplication with cython?
In-Reply-To: References: Message-ID:

On Tue, Dec 7, 2010 at 5:09 PM, wrote:

> There are faster ways to get the likelihood for a simple ARMA process
> than using a Kalman filter. I think the main advantage, and the reason
> for the popularity of the Kalman filter for this, is that it is easier to
> extend. So using too many tricks that are specific to the simple ARMA
> might take away much of the advantage of getting a fast Kalman filter.
>
Well, there are five forms of the standard Kalman filter that I am
somewhat familiar with, and some are better suited to some applications
than others. But at this point I don't see that there is any reason not
to use the common form for the ARMA case. It would be interesting to see
some profiling, since the matrix inversions are likely to dominate as
the number of variables goes up.

Chuck
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From gus.is.here at gmail.com Tue Dec 7 21:16:09 2010
From: gus.is.here at gmail.com (Gus Ishere)
Date: Tue, 7 Dec 2010 21:16:09 -0500
Subject: [SciPy-User] g++ Compilation error using weave
Message-ID:

I'm trying to use a very simple code snippet in weave:
http://codepad.org/zAaKKVhG

But I get the following error: "error: cannot convert `float' to
`PyObject*' in return"
I pasted a more verbose error in the link.

I'd appreciate any light on this.

Thanks,
Gustavo

From warren.weckesser at enthought.com Tue Dec 7 21:47:27 2010
From: warren.weckesser at enthought.com (Warren Weckesser)
Date: Tue, 7 Dec 2010 20:47:27 -0600
Subject: [SciPy-User] g++ Compilation error using weave
In-Reply-To: References: Message-ID:

On Tue, Dec 7, 2010 at 8:16 PM, Gus Ishere wrote:

> I'm trying to use a very simple code snippet in weave
> http://codepad.org/zAaKKVhG
>
> But I get the following error: "error: cannot convert `float' to
> `PyObject*' in return"
> I pasted a more verbose error in the link.
>

Hi Gustavo,

Don't use a 'return' statement; instead, assign the return value to the
variable 'return_val':

import scipy.weave
from scipy.weave import converters

def a():
    # weave for integration
    code = \
    """
    return_val = 1.0f;
    """
    return scipy.weave.inline(code, [],
                              type_converters=converters.blitz,
                              compiler='gcc')

print a()

Warren
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
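Building on Warren's fix, Python variables can be passed into the C snippet by listing their names; a small sketch (the function name and values are invented here, but the inline call pattern is the standard weave usage):

import scipy.weave

def weighted_sum(x, y):
    # 'x' and 'y' are looked up by name in the caller's scope and
    # converted to C doubles; the C code assigns its result to
    # return_val instead of using a C return statement.
    code = """
           return_val = 2.0 * x + y;
           """
    return scipy.weave.inline(code, ['x', 'y'])

# print weighted_sum(1.5, 2.0)   # -> 5.0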
From sebastian.walter at gmail.com Wed Dec 8 05:08:07 2010
From: sebastian.walter at gmail.com (Sebastian Walter)
Date: Wed, 8 Dec 2010 11:08:07 +0100
Subject: [SciPy-User] Speeding up integrate.odeint with weave/blitz
In-Reply-To: References: Message-ID:

On Tue, Dec 7, 2010 at 11:48 PM, Rob Clewley wrote:
> It will automatically compile the dX_dt function you
> specify into C and link it with the C integrator.

Does that mean that your tool translates Python to C?

From ralf.gommers at googlemail.com Wed Dec 8 06:18:55 2010
From: ralf.gommers at googlemail.com (Ralf Gommers)
Date: Wed, 8 Dec 2010 19:18:55 +0800
Subject: [SciPy-User] Problems installing scipy on OS X 10.6 (Snow Leopard): libarpack
In-Reply-To: References: Message-ID:

On Wed, Dec 8, 2010 at 4:06 AM, Uri Laserson wrote:

> The errors I am getting are here:
> https://gist.github.com/732293
>

The error indicates that 32 and 64 bit binaries are being mixed. Can you
tell us the following:
- what build command you used
- what Python you are using (from python.org, from Apple, self-compiled?)
- the output of "gcc -v", "g++ -v" and "gfortran -v"

Ralf
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From mathieu at mblondel.org Wed Dec 8 08:38:57 2010
From: mathieu at mblondel.org (Mathieu Blondel)
Date: Wed, 8 Dec 2010 22:38:57 +0900
Subject: [SciPy-User] Ax = b for symmetric positive definite matrix
Message-ID:

Hi everyone,

I want to solve equations of the form Ax = b or AX = B, where A is a
dense symmetric positive definite matrix. I want to be able to support
a potentially large A, so I find it convenient to store A as a 1d-vector
containing the upper part of the matrix only. It's easy to convert
this 1d representation to a 2d representation, and vice-versa.

I believe functions like linalg.solve (when passed sym_pos=True) and
linalg.cholesky would benefit from accepting 1d-arrays of this kind.

Is there a memory efficient way of solving my equations for a large A?

Thanks,
Mathieu
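One workaround, until packed-storage support exists, is to unpack the 1d upper triangle into a temporarily allocated full matrix and use the Cholesky-based solvers. A rough sketch, assuming the 1d vector stores the upper triangle row by row (the function name and storage convention are assumptions for illustration):

import numpy as np
from scipy.linalg import cho_factor, cho_solve

def solve_packed_upper(ap, b):
    # ap: 1d array holding the upper triangle of an (n x n) SPD matrix,
    # stored row by row (an assumed convention -- adjust to taste).
    # Recover n from len(ap) = n*(n+1)/2.
    n = int(round((np.sqrt(8 * ap.size + 1) - 1) / 2))
    A = np.zeros((n, n))
    A[np.triu_indices(n)] = ap
    A = A + A.T - np.diag(A.diagonal())   # symmetrize
    c, low = cho_factor(A)                # Cholesky factorization
    return cho_solve((c, low), b)

This of course gives up the memory savings while solving; staying packed end to end needs the LAPACK ?ppsv/?pptrf routines discussed in the reply below.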
From dagss at student.matnat.uio.no Wed Dec 8 08:47:57 2010
From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn)
Date: Wed, 08 Dec 2010 14:47:57 +0100
Subject: [SciPy-User] Ax = b for symmetric positive definite matrix
In-Reply-To: References: Message-ID: <4CFF8C8D.9080602@student.matnat.uio.no>

On 12/08/2010 02:38 PM, Mathieu Blondel wrote:
> Is there a memory efficient way of solving my equations for a large A?
>

Perhaps not the answer you're looking for, but:

It is not supported in SciPy yet, but if you want to play with calling
LAPACK directly, this is supported through sppsv/dppsv/cppsv/zppsv. For
instance you could modify the scipy/linalg/*.pyf files to include the
function and rebuild SciPy. Or, search for "tokyo cython lapack" and play
with that.

http://www.netlib.org/lapack/double/dppsv.f

Dag Sverre

From josef.pktd at gmail.com Wed Dec 8 09:33:36 2010
From: josef.pktd at gmail.com (josef.pktd at gmail.com)
Date: Wed, 8 Dec 2010 09:33:36 -0500
Subject: [SciPy-User] comparing statistical test in scipy.stats and R.stats, what do we have?
Message-ID:

I'm looking a bit at the status of "Statistics in Python", or "what do
they have that we don't?".

Here are raw tables of contents of R.stats and scipy.stats for
statistical tests. Some additional ones are in scikits.statsmodels,
but several are missing. (It doesn't contain the latest additions to
scipy.stats like fisherexact.)

Is anyone interested in adding some missing ones? (BSD compatible and
hopefully verified or verifiable, so that they don't have to linger for
a year or two in the ticket queue.)

I will keep adding the ones I'm interested in to scikits.statsmodels.
Josef

From the scraped table of contents of R library(stats), the functions
with "test" in the name:

>>> pprint([item for item in statstoc if 'test' in item[0].lower()])
[['ansari.test', 'Ansari-Bradley Test '],
 ['bartlett.test', 'Bartlett Test of Homogeneity of Variances '],
 ['binom.test', 'Exact Binomial Test '],
 ['Box.test', 'Box-Pierce and Ljung-Box Tests '],
 ['chisq.test', "Pearson's Chi-squared Test for Count Data "],
 ['cor.test', 'Test for Association/Correlation Between Paired Samples '],
 ['fisher.test', "Fisher's Exact Test for Count Data "],
 ['fligner.test', 'Fligner-Killeen Test of Homogeneity of Variances '],
 ['friedman.test', 'Friedman Rank Sum Test '],
 ['kruskal.test', 'Kruskal-Wallis Rank Sum Test '],
 ['ks.test', 'Kolmogorov-Smirnov Tests '],
 ['mantelhaen.test',
  'Cochran-Mantel-Haenszel Chi-Squared Test for Count Data '],
 ['mauchly.test', "Mauchly's Test of Sphericity "],
 ['mcnemar.test', "McNemar's Chi-squared Test for Count Data "],
 ['mood.test', 'Mood Two-Sample Test of Scale '],
 ['oneway.test', 'Test for Equal Means in a One-Way Layout '],
 ['pairwise.prop.test', 'Pairwise comparisons for proportions '],
 ['pairwise.t.test', 'Pairwise t tests '],
 ['pairwise.wilcox.test', 'Pairwise Wilcoxon rank sum tests '],
 ['poisson.test', 'Exact Poisson tests '],
 ['power.anova.test',
  'Power calculations for balanced one-way analysis of variance tests '],
 ['power.prop.test', 'Power calculations two sample test for proportions '],
 ['power.t.test', 'Power calculations for one and two sample t tests '],
 ['PP.test', 'Phillips-Perron Test for Unit Roots '],
 ['print.power.htest', 'Print method for power calculation object '],
 ['prop.test', 'Test of Equal or Given Proportions '],
 ['prop.trend.test', 'Test for trend in proportions '],
 ['quade.test', 'Quade Test '],
 ['shapiro.test', 'Shapiro-Wilk Normality Test '],
 ['t.test', "Student's t-Test "],
 ['var.test', 'F Test to Compare Two Variances '],
 ['wilcox.test', 'Wilcoxon Rank Sum and Signed Rank Tests ']]

In scipy.stats, the objects with "test" in their docs:

>>> [item for item in dir(stats) if (getattr(stats, item).__doc__ and 'test' in getattr(stats, item).__doc__)]
['Tester', 'anderson', 'ansari', 'bartlett', 'binom_test',
 'chisquare', 'f_oneway', 'fligner', 'friedmanchisquare', 'glm',
 'kendalltau', 'kruskal', 'ks_2samp', 'ksone', 'ksprob', 'kstest',
 'kstwobign', 'kurtosis', 'kurtosistest', 'levene', 'linregress',
 'mannwhitneyu', 'mood', 'normaltest', 'obrientransform', 'oneway',
 'pearsonr', 'percentileofscore', 'ranksums', 'rv_discrete', 'shapiro',
 'skew', 'skewtest', 'spearmanr', 'statlib', 'stats', 'test',
 'tiecorrect', 'ttest_1samp', 'ttest_ind', 'ttest_rel', 'wilcoxon']

Josef
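To give a sense of how small some of the missing pieces are, here is a rough, unverified numpy/scipy sketch of one entry from the R list above: Box.test in its Ljung-Box form (the function name is invented; this is not an existing scipy.stats function):

import numpy as np
from scipy import stats

def ljung_box(x, lags):
    # Ljung-Box Q statistic: Q = n(n+2) * sum_k acf(k)^2 / (n-k),
    # k = 1..lags, referred to a chi-squared distribution with
    # `lags` degrees of freedom.
    x = np.asarray(x) - np.mean(x)
    n = len(x)
    denom = np.sum(x**2)
    acf = np.array([np.sum(x[k:] * x[:-k]) / denom
                    for k in range(1, lags + 1)])
    q = n * (n + 2) * np.sum(acf**2 / (n - np.arange(1, lags + 1)))
    pvalue = stats.chi2.sf(q, lags)
    return q, pvalue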
From alexander.borghgraef.rma at gmail.com Wed Dec 8 09:40:19 2010
From: alexander.borghgraef.rma at gmail.com (Alexander Borghgraef)
Date: Wed, 8 Dec 2010 15:40:19 +0100
Subject: [SciPy-User] Installed scikits.learn, .pth file doesn't add egg to path
Message-ID:

Hi all,

I've installed the latest scikits.learn package via easy_install locally
into my home directory. The local libs are located in
$HOME/local/libs/python2.6/site-packages, which I added to my PYTHONPATH
variable. Now, easy_install puts the library into an egg within
site-packages, but adds an easy_install.pth file which should make it
visible for import, so I shouldn't have to put every egg into my
PYTHONPATH manually. However, "import scikits" doesn't work. It does work
when I add the egg directory (scikits.learn-0.5-py2.6-linux-x86_64.egg)
to the PYTHONPATH.

I'm not familiar with .pth files, though they seem simple, and at a first
glance it should work. The file contains the following code:

import sys; sys.__plen = len(sys.path)
./site-packages/scikits.learn-0.5-py2.6-linux-x86_64.egg
import sys; new=sys.path[sys.__plen:]; del sys.path[sys.__plen:];
p=getattr(sys,'__egginsert',0); sys.path[p:p]=new; sys.__egginsert = p+len(new)

I also fiddled a bit with it, removing the leading ./, or adding the full
path of the egg, but no results. Any suggestions?

--
Alex Borghgraef
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From mathieu at mblondel.org Wed Dec 8 10:16:18 2010
From: mathieu at mblondel.org (Mathieu Blondel)
Date: Thu, 9 Dec 2010 00:16:18 +0900
Subject: [SciPy-User] Ax = b for symmetric positive definite matrix
In-Reply-To: <4CFF8C8D.9080602@student.matnat.uio.no>
References: <4CFF8C8D.9080602@student.matnat.uio.no>
Message-ID:

On Wed, Dec 8, 2010 at 10:47 PM, Dag Sverre Seljebotn wrote:
> It is not supported in SciPy yet, but if you want to play with calling
> LAPACK directly, this is supported through sppsv/dppsv/cppsv/zppsv. For
> instance you could modify the scipy/linalg/*.pyf files to include the
> function and rebuild SciPy. Or, search for "tokyo cython lapack" and play
> with that.

Thanks!

Ideally, I would like to include the relevant Fortran or C files
directly in my project so that it is easy to compile for other users.
In this regard, maybe CLAPACK + Cython is a better choice than
LAPACK + f2py?

Mathieu

From nwagner at iam.uni-stuttgart.de Wed Dec 8 11:53:38 2010
From: nwagner at iam.uni-stuttgart.de (Nils Wagner)
Date: Wed, 08 Dec 2010 17:53:38 +0100
Subject: [SciPy-User] Ax = b for symmetric positive definite matrix
In-Reply-To: <4CFF8C8D.9080602@student.matnat.uio.no>
References: <4CFF8C8D.9080602@student.matnat.uio.no>
Message-ID:

On Wed, 08 Dec 2010 14:47:57 +0100 Dag Sverre Seljebotn wrote:
> It is not supported in SciPy yet, but if you want to play with calling
> LAPACK directly, this is supported through sppsv/dppsv/cppsv/zppsv.
>

I have filed a ticket

http://projects.scipy.org/scipy/ticket/456

for that purpose.
Nils

From dagss at student.matnat.uio.no Wed Dec 8 13:03:39 2010
From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn)
Date: Wed, 08 Dec 2010 19:03:39 +0100
Subject: [SciPy-User] Ax = b for symmetric positive definite matrix
In-Reply-To: References: <4CFF8C8D.9080602@student.matnat.uio.no>
Message-ID: <4CFFC87B.3050308@student.matnat.uio.no>

On 12/08/2010 04:16 PM, Mathieu Blondel wrote:
> Ideally, I would like to include the relevant Fortran or C files
> directly in my project so that it is easy to compile for other users.
> In this regard, maybe CLAPACK + Cython is a better choice than
> LAPACK + f2py?
>

To have any performance at all you *need* to link to whatever LAPACK
the user is using (ATLAS, Intel MKL, etc.). The most pain-free way to
do that currently is to patch/improve SciPy, I believe.

Dag Sverre

From solomon.negusse at twdb.state.tx.us Wed Dec 8 15:18:02 2010
From: solomon.negusse at twdb.state.tx.us (Solomon M Negusse)
Date: Wed, 8 Dec 2010 12:18:02 -0800 (PST)
Subject: [SciPy-User] [SciPy-user] scipy/numpy on CentOS 5/ Dependency problem
In-Reply-To: References: <261cc8ff0809270758p30f1f259r2481497402539c2d@mail.gmail.com>
Message-ID: <30409488.post@talk.nabble.com>

Hello there,

I hit the same dependency problem while trying to install numpy and scipy
on my Linux box this morning. Can you please help with installing these
packages? My search on Google didn't come up with a useful result.

Thank you,
-Solomon

Matthieu Brucher-2 wrote:
>
> Hi,
>
> Start by installing refblas3, refblas3-dev, lapack3 and lapack3-dev.
> Last time I had this issue, installing the dependencies by hand first
> solved it.
>
> Matthieu
>
> 2008/9/27 Matthias Blaicher:
>> Hello,
>>
>> I want to install SciPy and numpy on a CentOS 5 system. It contains a
>> fresh install with up-to-date packages. I use the "official" Suse
>> Build repos by David Cournapeau.
>>
>> It fails with a conflict between refblas3 and blas.
>> >> [root at localhost yum.repos.d]# yum install python-numpy python-scipy >> Loading "fastestmirror" plugin >> Loading mirror speeds from cached hostfile >> * home_ashigabou: download.opensuse.org >> * extras: mirrors.tummy.com >> * updates: dds.gina.alaska.edu >> * base: mirror.hmc.edu >> * addons: mirrors.tummy.com >> extras 100% |=========================| 1.1 kB >> 00:00 >> updates 100% |=========================| 951 B >> 00:00 >> base 100% |=========================| 1.1 kB >> 00:00 >> addons 100% |=========================| 951 B >> 00:00 >> Setting up Install Process >> Parsing package install arguments >> Resolving Dependencies >> --> Running transaction check >> ---> Package python-scipy.i386 0:0.6.0-2.1 set to be updated >> --> Processing Dependency: libblas.so.3 for package: python-scipy >> --> Processing Dependency: libblas.so.3 for package: python-scipy >> --> Processing Dependency: liblapack.so.3 for package: python-scipy >> --> Processing Dependency: libgfortran.so.1 for package: python-scipy >> --> Processing Dependency: liblapack.so.3 for package: python-scipy >> ---> Package python-numpy.i386 0:1.2.0-1.1 set to be updated >> --> Processing Dependency: gcc-gfortran for package: python-numpy >> --> Processing Dependency: refblas3 for package: python-numpy >> --> Processing Dependency: lapack3 < 3.1 for package: python-numpy >> --> Running transaction check >> ---> Package gcc-gfortran.i386 0:4.1.2-42.el5 set to be updated >> --> Processing Dependency: gcc = 4.1.2-42.el5 for package: gcc-gfortran >> ---> Package blas.i386 0:3.0-37.el5 set to be updated >> ---> Package lapack3.i386 0:3.0-19.1 set to be updated >> ---> Package lapack.i386 0:3.0-37.el5 set to be updated >> ---> Package refblas3.i386 0:3.0-11.1 set to be updated >> ---> Package libgfortran.i386 0:4.1.2-42.el5 set to be updated >> --> Running transaction check >> ---> Package gcc.i386 0:4.1.2-42.el5 set to be updated >> --> Processing Dependency: libgomp.so.1 for package: gcc >> --> Processing Dependency: glibc-devel >= 2.2.90-12 for package: gcc >> --> Processing Dependency: libgomp = 4.1.2-42.el5 for package: gcc >> --> Running transaction check >> ---> Package glibc-devel.i386 0:2.5-24 set to be updated >> --> Processing Dependency: glibc-headers for package: glibc-devel >> --> Processing Dependency: glibc-headers = 2.5-24 for package: >> glibc-devel >> ---> Package libgomp.i386 0:4.1.2-42.el5 set to be updated >> --> Running transaction check >> ---> Package glibc-headers.i386 0:2.5-24 set to be updated >> --> Processing Dependency: kernel-headers for package: glibc-headers >> --> Processing Dependency: kernel-headers >= 2.2.1 for package: >> glibc-headers >> --> Running transaction check >> ---> Package kernel-headers.i386 0:2.6.18-92.1.13.el5 set to be updated >> --> Processing Conflict: refblas3 conflicts blas >> --> Finished Dependency Resolution >> Error: refblas3 conflicts with blas >> >> Is there anything I'm doing wrong here, as I don't want to compile all >> dependencies by myself.. 
>> >> Sincerly, >> >> Matthias Blaicher >> _______________________________________________ >> SciPy-user mailing list >> SciPy-user at scipy.org >> http://projects.scipy.org/mailman/listinfo/scipy-user >> > > > > -- > French PhD student > Information System Engineer > Website: http://matthieu-brucher.developpez.com/ > Blogs: http://matt.eifelle.com and http://blog.developpez.com/?blog=92 > LinkedIn: http://www.linkedin.com/in/matthieubrucher > _______________________________________________ > SciPy-user mailing list > SciPy-user at scipy.org > http://projects.scipy.org/mailman/listinfo/scipy-user > > -- View this message in context: http://old.nabble.com/scipy-numpy-on-CentOS-5--Dependency-problem-tp19703443p30409488.html Sent from the Scipy-User mailing list archive at Nabble.com. From rob.clewley at gmail.com Wed Dec 8 16:07:35 2010 From: rob.clewley at gmail.com (Rob Clewley) Date: Wed, 8 Dec 2010 16:07:35 -0500 Subject: [SciPy-User] Speeding up integrate.odeint with weave/blitz In-Reply-To: References: Message-ID: >> It will automatically compile the dX_dt function you >> specify into C and link it with the C integrator. > > Does that mean that your tool translates Python to C? No, it translates an almost-subset of python, specified by strings or symbolic objects, to actual python or C, using declared variable and parameter names rather than an indexed array. The two non-python statements are an "if" function and a "for" macro. You can call other user-defined functions declared in a similar way. You can also drop in arbitrary code into the internally-created function before or after these strings are coded. And you can import arbitrary libraries for both python and C targets. -- Robert Clewley, Ph.D. Assistant Professor Neuroscience Institute and Department of Mathematics and Statistics Georgia State University PO Box 5030 Atlanta, GA 30302, USA tel: 404-413-6420 fax: 404-413-5446 http://www2.gsu.edu/~matrhc http://neuroscience.gsu.edu/rclewley.html From sebastian.walter at gmail.com Wed Dec 8 17:37:38 2010 From: sebastian.walter at gmail.com (Sebastian Walter) Date: Wed, 8 Dec 2010 23:37:38 +0100 Subject: [SciPy-User] Speeding up integrate.odeint with weave/blitz In-Reply-To: References: Message-ID: On Wed, Dec 8, 2010 at 10:07 PM, Rob Clewley wrote: >>> It will automatically compile the dX_dt function you >>> specify into C and link it with the C integrator. >> >> Does that mean that your tool translates Python to C? > > No, it translates an almost-subset of python, specified by strings or > symbolic objects, to actual python or C, using declared variable and > parameter names rather than an indexed array. The two non-python > statements are an "if" function and a "for" macro. Could you point me to some short code example where the "for" macro is used? > You can call other > user-defined functions declared in a similar way. You can also drop in > arbitrary code into the internally-created function before or after > these strings are coded. And you can import arbitrary libraries for > both python and C targets. > > -- > Robert Clewley, Ph.D. 
> Assistant Professor > Neuroscience Institute and > Department of Mathematics and Statistics > Georgia State University > PO Box 5030 > Atlanta, GA 30302, USA > > tel: 404-413-6420 fax: 404-413-5446 > http://www2.gsu.edu/~matrhc > http://neuroscience.gsu.edu/rclewley.html > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From oliphant at enthought.com Wed Dec 8 18:13:30 2010 From: oliphant at enthought.com (Travis Oliphant) Date: Wed, 8 Dec 2010 17:13:30 -0600 Subject: [SciPy-User] comparing statistical test in scipy.stats and R.stats, what do we have? In-Reply-To: References: Message-ID: <5A0944AA-E790-47C6-92FA-EADC4367435B@enthought.com> It would be great to get these into scipy stats -- particularly the ones you are interested in :-) Travis -- (mobile phone of) Travis Oliphant Enthought, Inc. 1-512-536-1057 http://www.enthought.com On Dec 8, 2010, at 8:33 AM, josef.pktd at gmail.com wrote: > I'm looking a bit at the status of "Statistics in Python", or "what do > they have, and we don't". > > Here are raw tables of content of R.stats and scipy.stats for > statistical tests. Some additional ones are in scikits.statsmodels, > but several are missing. (It doesn't contain the latest additions to > scipy.stats like fisherexact) > > Is anyone interested in adding some missing ones? (BSD compatible and > hopefully verified or verifiable so that they don't have to linger for > a year or two in the ticket queue) > > I will keep adding the ones I'm interested in to scikits.statsmodels. > > Josef > > from the scraped table of content of R library(stats) functions with > "test" in name > >>>> pprint([item for item in statstoc if 'test' in item[0].lower()]) > [['ansari.test', 'Ansari-Bradley Test '], > ['bartlett.test', 'Bartlett Test of Homogeneity of Variances '], > ['binom.test', 'Exact Binomial Test '], > ['Box.test', 'Box-Pierce and Ljung-Box Tests '], > ['chisq.test', "Pearson's Chi-squared Test for Count Data "], > ['cor.test', 'Test for Association/Correlation Between Paired Samples '], > ['fisher.test', "Fisher's Exact Test for Count Data "], > ['fligner.test', 'Fligner-Killeen Test of Homogeneity of Variances '], > ['friedman.test', 'Friedman Rank Sum Test '], > ['kruskal.test', 'Kruskal-Wallis Rank Sum Test '], > ['ks.test', 'Kolmogorov-Smirnov Tests '], > ['mantelhaen.test', > 'Cochran-Mantel-Haenszel Chi-Squared Test for Count Data '], > ['mauchly.test', "Mauchly's Test of Sphericity "], > ['mcnemar.test', "McNemar's Chi-squared Test for Count Data "], > ['mood.test', 'Mood Two-Sample Test of Scale '], > ['oneway.test', 'Test for Equal Means in a One-Way Layout '], > ['pairwise.prop.test', 'Pairwise comparisons for proportions '], > ['pairwise.t.test', 'Pairwise t tests '], > ['pairwise.wilcox.test', 'Pairwise Wilcoxon rank sum tests '], > ['poisson.test', 'Exact Poisson tests '], > ['power.anova.test', > 'Power calculations for balanced one-way analysis of variance tests '], > ['power.prop.test', 'Power calculations two sample test for proportions '], > ['power.t.test', 'Power calculations for one and two sample t tests '], > ['PP.test', 'Phillips-Perron Test for Unit Roots '], > ['print.power.htest', 'Print method for power calculation object '], > ['prop.test', 'Test of Equal or Given Proportions '], > ['prop.trend.test', 'Test for trend in proportions '], > ['quade.test', 'Quade Test '], > ['shapiro.test', 'Shapiro-Wilk Normality Test '], > ['t.test', "Student's t-Test "], > 
['var.test', 'F Test to Compare Two Variances '], > ['wilcox.test', 'Wilcoxon Rank Sum and Signed Rank Tests ']] > > > in scipy.stats objects with "test" in docs > >>>> [item for item in dir(stats) if (getattr(stats, item).__doc__ and 'test' in getattr(stats, item).__doc__)] > ['Tester', 'anderson', 'ansari', 'bartlett', 'binom_test', > 'chisquare', 'f_oneway', 'fligner', 'friedmanchisquare', 'glm', > 'kendalltau', 'kruskal', 'ks_2samp', 'ksone', 'ksprob', 'kstest', > 'kstwobign', 'kurtosis', 'kurtosistest', 'levene', 'linregress', > 'mannwhitneyu', 'mood', 'normaltest', 'obrientransform', 'oneway', > 'pearsonr', 'percentileofscore', 'ranksums', 'rv_discrete', 'shapiro', > 'skew', 'skewtest', 'spearmanr', 'statlib', 'stats', 'test', > 'tiecorrect', 'ttest_1samp', 'ttest_ind', 'ttest_rel', 'wilcoxon'] > > > Josef > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user From josef.pktd at gmail.com Wed Dec 8 18:37:42 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Wed, 8 Dec 2010 18:37:42 -0500 Subject: [SciPy-User] comparing statistical test in scipy.stats and R.stats, what do we have? In-Reply-To: <5A0944AA-E790-47C6-92FA-EADC4367435B@enthought.com> References: <5A0944AA-E790-47C6-92FA-EADC4367435B@enthought.com> Message-ID: On Wed, Dec 8, 2010 at 6:13 PM, Travis Oliphant wrote: > It would be great to get these into scipy stats -- particularly the ones you are interested in :-) Right now, most of the tests that I'm working on are in support of regression models, especially diagnostic and specification tests, so they require statsmodels. Josef > > Travis > > -- > (mobile phone of) > Travis Oliphant > Enthought, Inc. > 1-512-536-1057 > http://www.enthought.com > > On Dec 8, 2010, at 8:33 AM, josef.pktd at gmail.com wrote: > >> I'm looking a bit at the status of "Statistics in Python", or "what do >> they have, and we don't". >> >> Here are raw tables of content of R.stats and scipy.stats for >> statistical tests. Some additional ones are in scikits.statsmodels, >> but several are missing. (It doesn't contain the latest additions to >> scipy.stats like fisherexact) >> >> Is anyone interested in adding some missing ones? (BSD compatible and >> hopefully verified or verifiable so that they don't have to linger for >> a year or two in the ticket queue) >> >> I will keep adding the ones I'm interested in to scikits.statsmodels. 
>> >> Josef
>>
>> [clip: quoted R and scipy.stats listings, identical to the message above]
>>
>> Josef
>> _______________________________________________
>> SciPy-User mailing list
>> SciPy-User at scipy.org
>> http://mail.scipy.org/mailman/listinfo/scipy-user
> _______________________________________________
> SciPy-User mailing list
> SciPy-User at scipy.org
> http://mail.scipy.org/mailman/listinfo/scipy-user

From jsseabold at gmail.com Wed Dec 8 20:08:43 2010
From: jsseabold at gmail.com (Skipper Seabold)
Date: Wed, 8 Dec 2010 20:08:43 -0500
Subject: [SciPy-User] fast small matrix multiplication with cython?
In-Reply-To: References: Message-ID: On Tue, Dec 7, 2010 at 7:37 PM, Charles R Harris wrote: > > > On Tue, Dec 7, 2010 at 5:09 PM, wrote: >> >> On Tue, Dec 7, 2010 at 6:45 PM, Charles R Harris >> wrote: >> > >> > >> > On Tue, Dec 7, 2010 at 10:39 AM, Skipper Seabold >> > wrote: >> >> >> >> On Tue, Dec 7, 2010 at 12:17 PM, Charles R Harris >> >> wrote: >> >> > >> >> > >> >> > On Tue, Dec 7, 2010 at 10:05 AM, wrote: >> >> >> >> >> >> >> >> >> It's still a linear filter, non-linear optimization comes in because >> >> >> the exact loglikelihood function for ARMA is non-linear in the >> >> >> coefficients. >> >> >> (There might be a way to calculate the derivative in the same loop, >> >> >> but that's a different issue.) >> >> >> >> >> > >> >> > The unscented Kalman filter is a better way to estimate the >> >> > covariance >> >> > of a >> >> > non-linear process, think of it as a better integrator. If the >> >> > propagation >> >> > is easy to compute, which seems to be the case here, it will probably >> >> > save >> >> > you some time. You might even be able to use the basic idea and skip >> >> > the >> >> > Kalman part altogether. >> >> > >> >> > My general aim here is to optimize the algorithm first before getting >> >> > caught >> >> > up in the details of matrix multiplication in c. Premature >> >> > optimization >> >> > and >> >> > all that. >> >> > >> >> >> >> Hmm I haven't seen this mentioned much in what I've been reading or >> >> the documentation on existing software for ARMA processes, so I never >> >> thought much about it. ?I will have a closer look. ?Well, google turns >> >> up this thread... >> >> >> > >> > I've started reading up a bit on what you are doing and the application >> > doesn't use extended Kalman filters, so the suggestion to use unscented >> > Kalman filters is irrelevant. Sorry about that ;) I'm still wading >> > through >> > the various statistical notation thickets to see if there might be a >> > better >> > form to use for the problem but I don't see one at the moment. >> >> There are faster ways to get the likelihood for a simple ARMA process >> than using a Kalman Filter. I think the main advantage and reason for >> the popularity of Kalman Filter for this is that it is easier to >> extend. So using too many tricks that are specific to the simple ARMA >> might take away much of the advantage of getting a fast Kalman Filter. >> >> I didn't read much of the details for the Kalman Filter for this, but >> that was my conclusion from the non-Kalman Filter literature. >> That's the idea. The advantages of the KF are that it's inherently structural and it's *very* general. The ARMA case was just a jumping off point, but has also proved to be a sticking point. I'd like to have a fast and general linear Gaussian KF available for larger state space models, as it's the baseline workhorse for estimating linearized large macroeconomic models at the moment. > > Well, there are five forms of the standard Kalman filter that I am somewhat > familiar with and some are better suited to some applications than others. > But at this point I don't see that there is any reason not to use the common > form for the ARMA case. It would be interesting to see some profiling since > the matrix inversions are likely to dominate as the number of variables go > up. > If interested, I am using Durbin and Koopman's "Time Series Analysis by State Space Methods" and Andrew Harvey's "Forecasting, Structural Time Series Models, and the Kalman Filter" as my main references for this. 
The former is nice and concise but has a lot of details, suggestions,
and use cases.

I have looked some more and it does seem that the filter converges to
its steady state after maybe 2% of the iterations, depending on the
properties of the series, so for the ARMA case I can switch to the
fast recursions, only updating the state (not quite sure on the time
savings yet), but I am moving away from my goal of a fast and general
KF implementation...

About the matrix inversions, the ARMA model right now is only
univariate, so there is no real inverting of matrices. The suggestion
of Durbin and Koopman for larger, multivariate cases is to split it
into a series of univariate problems in order to avoid inversion.
They provide in the book some benchmarks on computational efficiency,
in terms of multiplications needed, based on their experience writing
http://www.ssfpack.com/index.html.

Skipper

From jsseabold at gmail.com Wed Dec 8 23:15:55 2010
From: jsseabold at gmail.com (Skipper Seabold)
Date: Wed, 8 Dec 2010 23:15:55 -0500
Subject: [SciPy-User] fast small matrix multiplication with cython?
In-Reply-To: References: Message-ID:

On Wed, Dec 8, 2010 at 8:08 PM, Skipper Seabold wrote:
> [clip: full quote of the previous message]

It looks like I don't save too much time with just Python/scipy
optimizations. Apparently ~75% of the time is spent in l-bfgs-b,
judging by its user time output and the profiler's CPU time output(?).
Non-cython versions:

Brief and rough profiling on my laptop for ARMA(2,2) with 1000
observations. Optimization uses fmin_l_bfgs_b with m = 12 and iprint
= 0.
Full Kalman Filter, starting parameters found via iterations in Python
-----------------------------------------------------------------------
1696041 function calls (1695957 primitive calls) in 7.622 CPU seconds

Ordered by: internal time

 ncalls  tottime  percall  cumtime  percall  filename:lineno(function)
    114    3.226    0.028    5.123    0.045  kalmanf.py:504(loglike)
1368384    1.872    0.000    1.872    0.000  {numpy.core._dotblas.dot}
    102    1.736    0.017    2.395    0.023  arima.py:196(loglike_css)
 203694    0.622    0.000    0.622    0.000  {sum}
     90    0.023    0.000    0.023    0.000  {numpy.linalg.lapack_lite.dgesdd}
    218    0.020    0.000    0.024    0.000  arima.py:117(_transparams)
     46    0.015    0.000    0.016    0.000  function_base.py:494(asarray_chkfinite)
   1208    0.013    0.000    0.013    0.000  {numpy.core.multiarray.array}
 102163    0.012    0.000    0.012    0.000  {method 'append' of 'list' objects}
     46    0.010    0.000    0.028    0.001  decomp_svd.py:12(svd)

Full Kalman Filter, starting parameters found with scipy.signal.lfilter
------------------------------------------------------------------------
1249493 function calls (1249409 primitive calls) in 4.596 CPU seconds

Ordered by: internal time

 ncalls  tottime  percall  cumtime  percall  filename:lineno(function)
    102    2.862    0.028    4.437    0.043  kalmanf.py:504(loglike)
1224360    1.556    0.000    1.556    0.000  {numpy.core._dotblas.dot}
    270    0.029    0.000    0.029    0.000  {sum}
     90    0.025    0.000    0.025    0.000  {numpy.linalg.lapack_lite.dgesdd}
    194    0.018    0.000    0.021    0.000  arima.py:117(_transparams)
     46    0.016    0.000    0.017    0.000  function_base.py:494(asarray_chkfinite)
     46    0.011    0.000    0.029    0.001  decomp_svd.py:12(svd)

Kalman Filter with fast recursions, starting parameters with lfilter
---------------------------------------------------------------------
1097454 function calls (1097370 primitive calls) in 4.465 CPU seconds

Ordered by: internal time

 ncalls  tottime  percall  cumtime  percall  filename:lineno(function)
     90    2.860    0.032    4.305    0.048  kalmanf.py:504(loglike)
1073757    1.431    0.000    1.431    0.000  {numpy.core._dotblas.dot}
    270    0.029    0.000    0.029    0.000  {sum}
     90    0.025    0.000    0.025    0.000  {numpy.linalg.lapack_lite.dgesdd}
    182    0.016    0.000    0.019    0.000  arima.py:117(_transparams)
     46    0.016    0.000    0.018    0.000  function_base.py:494(asarray_chkfinite)
     46    0.011    0.000    0.030    0.001  decomp_svd.py:12(svd)

Skipper

From josef.pktd at gmail.com Wed Dec 8 23:28:13 2010
From: josef.pktd at gmail.com (josef.pktd at gmail.com)
Date: Wed, 8 Dec 2010 23:28:13 -0500
Subject: [SciPy-User] fast small matrix multiplication with cython?
In-Reply-To: References: Message-ID:

> It looks like I don't save too much time with just Python/scipy
> optimizations. Apparently ~75% of the time is spent in l-bfgs-b,
> judging by its user time output and the profiler's CPU time output(?).
> Non-cython versions:
>
> Brief and rough profiling on my laptop for ARMA(2,2) with 1000
> observations. Optimization uses fmin_l_bfgs_b with m = 12 and iprint
> = 0.

Completely different idea: How costly are the numerical derivatives in l-bfgs-b?
With l-bfgs-b, you should be able to replace the derivatives with the
complex step derivatives that calculate the loglike function value and
the derivatives in one iteration.

Josef

From jsseabold at gmail.com Thu Dec 9 16:33:41 2010
From: jsseabold at gmail.com (Skipper Seabold)
Date: Thu, 9 Dec 2010 16:33:41 -0500
Subject: [SciPy-User] fast small matrix multiplication with cython?
In-Reply-To: References: Message-ID:

On Wed, Dec 8, 2010 at 11:28 PM, wrote:
>> It looks like I don't save too much time with just Python/scipy
>> optimizations. Apparently ~75% of the time is spent in l-bfgs-b,
>> judging by its user time output and the profiler's CPU time output(?).
>> Non-cython versions:
>>
>> Brief and rough profiling on my laptop for ARMA(2,2) with 1000
>> observations. Optimization uses fmin_l_bfgs_b with m = 12 and iprint
>> = 0.
>
> Completely different idea: How costly are the numerical derivatives in l-bfgs-b?
> With l-bfgs-b, you should be able to replace the derivatives with the
> complex step derivatives that calculate the loglike function value and
> the derivatives in one iteration.

I couldn't figure out how to use it without some hacks. The
fmin_l_bfgs_b will call both f and fprime as (x, *args), but
approx_fprime or approx_fprime_cs actually need approx_fprime(x, func,
args=args) and call func(x, *args). I changed fmin_l_bfgs_b to make
the call like this for the gradient, and I get (different computer)

Using approx_fprime_cs
-----------------------------------
861609 function calls (861525 primitive calls) in 3.337 CPU seconds

Ordered by: internal time

 ncalls  tottime  percall  cumtime  percall  filename:lineno(function)
     70    1.942    0.028    3.213    0.046  kalmanf.py:504(loglike)
 840296    1.229    0.000    1.229    0.000  {numpy.core._dotblas.dot}
     56    0.038    0.001    0.038    0.001  {numpy.linalg.lapack_lite.zgesv}
    270    0.025    0.000    0.025    0.000  {sum}
     90    0.019    0.000    0.019    0.000  {numpy.linalg.lapack_lite.dgesdd}
     46    0.013    0.000    0.014    0.000  function_base.py:494(asarray_chkfinite)
    162    0.012    0.000    0.014    0.000  arima.py:117(_transparams)

Using approx_grad = True
---------------------------------------
1097454 function calls (1097370 primitive calls) in 3.615 CPU seconds

Ordered by: internal time

 ncalls  tottime  percall  cumtime  percall  filename:lineno(function)
     90    2.316    0.026    3.489    0.039  kalmanf.py:504(loglike)
1073757    1.164    0.000    1.164    0.000  {numpy.core._dotblas.dot}
    270    0.025    0.000    0.025    0.000  {sum}
     90    0.020    0.000    0.020    0.000  {numpy.linalg.lapack_lite.dgesdd}
    182    0.014    0.000    0.016    0.000  arima.py:117(_transparams)
     46    0.013    0.000    0.014    0.000  function_base.py:494(asarray_chkfinite)
     46    0.008    0.000    0.023    0.000  decomp_svd.py:12(svd)
     23    0.004    0.000    0.004    0.000  {method 'var' of 'numpy.ndarray' objects}

Definitely fewer function calls and a little faster, but I had to write
some hacks to get it to work.

This is more like it! With fast recursions in Cython:

15186 function calls (15102 primitive calls) in 0.750 CPU seconds

Ordered by: internal time

 ncalls  tottime  percall  cumtime  percall  filename:lineno(function)
     18    0.622    0.035    0.625    0.035  kalman_loglike.pyx:15(kalman_loglike)
    270    0.024    0.000    0.024    0.000  {sum}
     90    0.019    0.000    0.019    0.000  {numpy.linalg.lapack_lite.dgesdd}
    156    0.013    0.000    0.013    0.000  {numpy.core._dotblas.dot}
     46    0.013    0.000    0.014    0.000  function_base.py:494(asarray_chkfinite)
    110    0.008    0.000    0.010    0.000  arima.py:118(_transparams)
     46    0.008    0.000    0.023    0.000  decomp_svd.py:12(svd)
     23    0.004    0.000    0.004    0.000  {method 'var' of 'numpy.ndarray' objects}
     26    0.004    0.000    0.004    0.000  tsatools.py:109(lagmat)
     90    0.004    0.000    0.042    0.000  arima.py:197(loglike_css)
     81    0.004    0.000    0.004    0.000  {numpy.core.multiarray._fastCopyAndTranspose}

I can live with this for now.

Skipper
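For reference, here is the complex-step trick discussed in this thread in
a minimal, self-contained sketch. The helper below is illustrative only --
it is not statsmodels' actual approx_fprime_cs -- and it assumes the
objective is analytic and written so that it accepts complex-valued
parameter arrays:

    import numpy as np
    from scipy.optimize import fmin_l_bfgs_b

    def fprime_cs(x, f, args=(), h=1.0e-20):
        # Complex-step derivative: for analytic f,
        # f(x + i*h*e_j) = f(x) + i*h*df/dx_j + O(h**2), so
        # df/dx_j ~= Im(f(x + i*h*e_j)) / h. There is no subtractive
        # cancellation, so h can be tiny, and f(x) is the real part for free.
        grad = np.zeros(len(x))
        for j in range(len(x)):
            xc = np.asarray(x).astype(complex)
            xc[j] += 1j * h
            grad[j] = f(xc, *args).imag / h
        return grad

    # Toy objective standing in for the ARMA loglikelihood:
    def negloglike(params):
        return np.sum((params - np.arange(3.0)) ** 2)

    xopt, fval, info = fmin_l_bfgs_b(negloglike, np.zeros(3),
                                     fprime=lambda x: fprime_cs(x, negloglike),
                                     m=12, iprint=0)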
From raultron at gmail.com Thu Dec 9 20:42:28 2010
From: raultron at gmail.com (=?ISO-8859-1?Q?Raul_Acu=F1a?=)
Date: Thu, 9 Dec 2010 21:12:28 -0430
Subject: [SciPy-User] scipy.sparse.linalg.bicg for Non Symmetric Matrix.
Message-ID:

Hi,

I am using the iterative methods of scipy.sparse.linalg for solving a
linear system of equations Ax = b. My matrix A is non-symmetric. I've
been using the scipy.sparse.linalg.cg() function, multiplying both the
matrix "A" and "b" by the transpose of A so that the matrix becomes
symmetric:

Asym = matrix(dot(A.T,A))
bsym = matrix(dot(A.T,b))
sol = cg(A,b,tol = 1e-10,maxiter=30)

I've also been reading about the biconjugate gradient method, and if I
am not mistaken the literature says that this method works on
non-symmetric matrices, but when I try to use scipy.sparse.linalg.bicg()
it won't work:

sol = bicg(A,b,tol = 1e-10,maxiter=30)

File "C:\Python26\lib\site-packages\scipy\sparse\linalg\isolve\iterative.py", line 74, in bicg
    A,M,x,b,postprocess = make_system(A,M,x0,b,xtype)
File "C:\Python26\lib\site-packages\scipy\sparse\linalg\isolve\utils.py", line 65, in make_system
    raise ValueError('expected square matrix (shape=%s)' % shape)
NameError: global name 'shape' is not defined

Any help will be greatly appreciated. I am comparing these methods for
my data, with an emphasis on speed, for my master's thesis, so any
discrepancy with the theory would be a real problem for me.

Thanks in advance,

Raúl Acuña.

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From raultron at gmail.com Thu Dec 9 20:49:54 2010
From: raultron at gmail.com (=?ISO-8859-1?Q?Raul_Acu=F1a?=)
Date: Thu, 9 Dec 2010 21:19:54 -0430
Subject: Re: [SciPy-User] scipy.sparse.linalg.bicg for Non Symmetric Matrix.
In-Reply-To: References: Message-ID:

I made a typing error in the previous post; the first code segment is
this one instead:

Asym = matrix(dot(A.T,A))
bsym = matrix(dot(A.T,b))
sol = cg(Asym,bsym,tol = 1e-10,maxiter=30)

On Thu, Dec 9, 2010 at 9:12 PM, Raul Acuña wrote:
> [clip: full quote of the previous message]

--
Ing. Raúl Acuña
Profesor @Universidad Simón Bolívar
Grupo de Mecatrónica
Departamento de Electrónica y Circuitos
Tel. +58-212-4121983 / Cel +58-412-5840317

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From pav at iki.fi Fri Dec 10 07:51:13 2010
From: pav at iki.fi (Pauli Virtanen)
Date: Fri, 10 Dec 2010 12:51:13 +0000 (UTC)
Subject: [SciPy-User] scipy.sparse.linalg.bicg for Non Symmetric Matrix.
References: Message-ID:

Thu, 09 Dec 2010 21:12:28 -0430, Raul Acuña wrote:
> I am using the iterative methods of scipy.sparse.linalg for solving a
> linear system of equations Ax = b. My matrix A is non-symmetric. I've
> been using the scipy.sparse.linalg.cg() function, multiplying both the
> matrix "A" and "b" by the transpose of A so that the matrix becomes
> symmetric:
>
> Asym = matrix(dot(A.T,A))
> bsym = matrix(dot(A.T,b))
> sol = cg(A,b,tol = 1e-10,maxiter=30)
[clip]
> sol = bicg(A,b,tol = 1e-10,maxiter=30)
[clip]
> raise ValueError('expected square matrix (shape=%s)' % shape)
[clip]

The error is saying that your matrix is not a square matrix.
For a non-square matrix A, (A.T*A) is a square matrix, so that's why the
CG algorithm works. So it's not about symmetry of the matrix.

You should check that you have the same number of equations as unknowns.
Otherwise the solution either does not exist or is not unique.

--
Pauli Virtanen
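As a two-line illustration of the point about shapes, with an arbitrary
5-by-3 A (the numbers here are made up):

    import numpy as np

    A = np.random.rand(5, 3)        # non-square: 5 equations, 3 unknowns
    print(np.dot(A.T, A).shape)     # (3, 3) -- square, which is why CG can run on it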
From dagss at student.matnat.uio.no Fri Dec 10 07:57:25 2010
From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn)
Date: Fri, 10 Dec 2010 13:57:25 +0100
Subject: [SciPy-User] scipy.sparse.linalg.bicg for Non Symmetric Matrix.
In-Reply-To: References: Message-ID: <4D0223B5.5070104@student.matnat.uio.no>

On 12/10/2010 01:51 PM, Pauli Virtanen wrote:
> [clip: full quote of the previous message]

Note the NameError in the original post as well though -- that is,
there may be a bug in the exception raising code as well.

Dag Sverre

From pav at iki.fi Fri Dec 10 08:30:10 2010
From: pav at iki.fi (Pauli Virtanen)
Date: Fri, 10 Dec 2010 13:30:10 +0000 (UTC)
Subject: [SciPy-User] scipy.sparse.linalg.bicg for Non Symmetric Matrix.
References: <4D0223B5.5070104@student.matnat.uio.no> Message-ID:

Fri, 10 Dec 2010 13:57:25 +0100, Dag Sverre Seljebotn wrote:
[clip]
> Note the NameError in the original post as well though -- that is,
> there may be a bug in the exception raising code as well.

Seems to have been fixed in r6780

From raultron at gmail.com Fri Dec 10 10:01:39 2010
From: raultron at gmail.com (=?ISO-8859-1?Q?Raul_Acu=F1a?=)
Date: Fri, 10 Dec 2010 10:31:39 -0430
Subject: [SciPy-User] scipy.sparse.linalg.bicg for Non Symmetric Matrix.
In-Reply-To: References: <4D0223B5.5070104@student.matnat.uio.no> Message-ID:

Thank you all, I saw my mistake.

On Fri, Dec 10, 2010 at 9:00 AM, Pauli Virtanen wrote:
> Fri, 10 Dec 2010 13:57:25 +0100, Dag Sverre Seljebotn wrote:
> [clip]
> > Note the NameError in the original post as well though -- that is,
> > there may be a bug in the exception raising code as well.
>
> Seems to have been fixed in r6780
>
> _______________________________________________
> SciPy-User mailing list
> SciPy-User at scipy.org
> http://mail.scipy.org/mailman/listinfo/scipy-user

--
Ing. Raúl Acuña
Profesor @Universidad Simón Bolívar
Grupo de Mecatrónica
Departamento de Electrónica y Circuitos
Tel. +58-212-4121983 / Cel +58-412-5840317

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From rob.clewley at gmail.com Fri Dec 10 17:58:25 2010
From: rob.clewley at gmail.com (Rob Clewley)
Date: Fri, 10 Dec 2010 17:58:25 -0500
Subject: [SciPy-User] Speeding up integrate.odeint with weave/blitz
In-Reply-To: References: Message-ID:

Sebastian,

> Could you point me to some short code example where the "for" macro is used?

Actually I don't use it often myself. The syntax for defining variables
x_a through x_b (inclusive) using dummy index "i" is:

for(i, a, b, expr_in_i)

where expr_in_i contains a mathematical definition for x_i, involving
any declared parameters and variables, including x_a through x_b,
referred to using the syntax x[f(i)], where f(i) is an integer
arithmetical function of i. The for macro then creates a sequence of
expressions in which each occurrence of `[f(i)]` is replaced with the
appropriate integer in square brackets. E.g. a specification of
variables x1, x2 and x3 coupled to their neighbours in a ring could
look like

specs = {'x[i]': 'for(i, 1, 3, x[i-1] + 2*x[i])', 'x0': 'x2 + 2*x0'}

This requires the special end-case definition for x0, which wraps back
to x2, in order to be valid. When parsed, this will create a right-hand
side dX_dt function containing individual assignments for each of x0,
x1, x2, and x3. For instance, the one for x1 will be an encoding of
'x0 + 2*x1'.

This works in essentially the same way as specifications in the older
program XPP (X-PhasePlane), for anyone familiar with that package's
syntax. Hope that's what you wanted to know. BTW, this syntax structure
just enjoyed a small feature enhancement in the most recent update of
PyDSTool posted on sourceforge, thanks to your interest!

--
Robert Clewley, Ph.D.
Assistant Professor
Neuroscience Institute and
Department of Mathematics and Statistics
Georgia State University
PO Box 5030
Atlanta, GA 30302, USA

tel: 404-413-6420 fax: 404-413-5446
http://www2.gsu.edu/~matrhc
http://neuroscience.gsu.edu/rclewley.html
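To spell out what the parser does with that ring example, the generated
right-hand side is equivalent to something like the plain-Python sketch
below. The function and variable names are illustrative; PyDSTool's
actual generated code differs in form:

    # Hand expansion of
    #   specs = {'x[i]': 'for(i, 1, 3, x[i-1] + 2*x[i])', 'x0': 'x2 + 2*x0'}
    # where each occurrence of [f(i)] is replaced by the corresponding integer.
    def dX_dt(x0, x1, x2, x3):
        d_x0 = x2 + 2*x0    # explicit end case, wrapping back to x2
        d_x1 = x0 + 2*x1    # i = 1
        d_x2 = x1 + 2*x2    # i = 2
        d_x3 = x2 + 2*x3    # i = 3
        return d_x0, d_x1, d_x2, d_x3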
From dominique.orban at gmail.com Fri Dec 10 20:35:23 2010
From: dominique.orban at gmail.com (Dominique Orban)
Date: Fri, 10 Dec 2010 20:35:23 -0500
Subject: [SciPy-User] scipy.sparse.linalg.bicg for Non Symmetric Matrix.
Message-ID:

> ---------- Forwarded message ----------
> From: "Raul Acuña"
> To: scipy-user at scipy.org
> Date: Thu, 9 Dec 2010 21:19:54 -0430
> Subject: Re: [SciPy-User] scipy.sparse.linalg.bicg for Non Symmetric Matrix.
> [clip: full quote of Raul's two messages above]

Hi Raúl,

You can't use Bi-CG or Bi-CGSTAB to solve non-square systems directly. Your
system is either under- or over-determined. What you may be meaning to solve
here is a related linear least-squares problem. If A has more rows than
columns, you have an over-determined system (more equations than unknowns)
and a relevant problem is to

    minimize 1/2 * ||Ax - b||^2.

Conversely, if A has more columns than rows, you have an under-determined
system (more unknowns than conditions imposed on them) and you may want to

    minimize 1/2 * ||x||^2 subject to Ax = b

i.e., find the least-norm solution among the infinitely many possibilities.
In both cases, you can use MINRES, available in SciPy I think, or else in
PyKrylov (https://github.com/dpo/pykrylov) and in NLPy (http://nlpy.sf.net),
by solving the larger system

    [ I    A ] [ r ]   [ b ]
    [ A.T  0 ] [ x ] = [ 0 ]

(for the first problem) or

    [ I    A.T ] [ x ]   [ 0 ]
    [ A    0   ] [ r ] = [ b ]

(for the second problem). Though the coefficient matrix above is symmetric,
it is indefinite, so you can't use CG! To use MINRES, you just need to write
a function which computes the product of a vector with the coefficient
matrix.

In the first case, you can also use LSQR (available in NLPy and also
probably in SciPy). In this case, you'll just need to be able to do A*x and
A.T*y.

--
Dominique

-------------- next part --------------
An HTML attachment was scrubbed...
URL:
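A short sketch of the LSQR route for the over-determined case, under the
assumption that a sufficiently recent SciPy provides
scipy.sparse.linalg.lsqr (the shapes and tolerances below are made up for
illustration):

    import numpy as np
    from scipy.sparse.linalg import lsqr

    m, n = 100, 10                  # over-determined: more equations than unknowns
    A = np.random.rand(m, n)
    b = np.random.rand(m)

    # LSQR minimizes ||Ax - b|| using only the products A*x and A.T*y;
    # it never forms dot(A.T, A), which would square the condition number.
    x = lsqr(A, b, atol=1e-10, btol=1e-10)[0]

    # For comparison, the explicit normal equations used earlier in the thread:
    x_ne = np.linalg.solve(np.dot(A.T, A), np.dot(A.T, b))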
From raultron at gmail.com Fri Dec 10 21:40:31 2010
From: raultron at gmail.com (=?ISO-8859-1?Q?Raul_Acu=F1a?=)
Date: Fri, 10 Dec 2010 22:10:31 -0430
Subject: Re: [SciPy-User] scipy.sparse.linalg.bicg for Non Symmetric Matrix.
In-Reply-To: References: Message-ID:

Hi Dominique,

Thank you for your reply, it was very helpful. I definitely have an
over-determined system, and I need to find the solution in the least
amount of time possible. I've used the SVD method and now I'm
experimenting with the iterative methods; someone told me that they are
faster.

I used the Conjugate Gradient function from SciPy, using a trick to
convert the original system Ax = b to a system with a square matrix:
dot(A.T,A)x = dot(A.T,b). It is working and it finds a good solution
faster than the SVD method. However, I don't know if this is the ideal
method.

Now, looking at your reply and your advice of using MINRES: I read that
this method has guaranteed convergence, but in terms of speed is MINRES
a better method? Also, sorry about my ignorance, but I didn't understand
this:

    [ I    A ] [ r ]   [ b ]
    [ A.T  0 ] [ x ] = [ 0 ]

What is "r" in that system? I want to try this method with my system
using the function from SciPy and also try the one in PyKrylov.

Thanks again, best regards,

Raúl

On Fri, Dec 10, 2010 at 9:05 PM, Dominique Orban wrote:
> [clip: full quote of the previous message]

--
Ing. Raúl Acuña
Profesor @Universidad Simón Bolívar
Grupo de Mecatrónica
Departamento de Electrónica y Circuitos
Tel. +58-212-4121983 / Cel +58-412-5840317

-------------- next part --------------
An HTML attachment was scrubbed...
URL:
From dominique.orban at gmail.com Sat Dec 11 13:44:33 2010
From: dominique.orban at gmail.com (Dominique Orban)
Date: Sat, 11 Dec 2010 13:44:33 -0500
Subject: [SciPy-User] scipy.sparse.linalg.bicg for Non Symmetric Matrix.
Message-ID:

On Sat, Dec 11, 2010 at 1:00 PM, wrote:
> ---------- Forwarded message ----------
> From: "Raul Acuña"
> To: SciPy Users List
> Date: Fri, 10 Dec 2010 22:10:31 -0430
> Subject: Re: [SciPy-User] scipy.sparse.linalg.bicg for Non Symmetric Matrix.
> [clip: full quote of the previous message]

Hi Raúl,

Sorry, perhaps I'm using notation that may not be clear to everyone. The
notation

    [ I    A ] [ r ]   [ b ]
    [ A.T  0 ] [ x ] = [ 0 ]

means that you're solving a linear system with a coefficient matrix defined
by blocks (in Matlab notation, it would be something like [I, A ; A', 0]).
If your A is m-by-n, this matrix is (m+n)-by-(m+n). It is quite a bit
larger, but it is very sparse. The right-hand side is a vector of length
m+n with your vector b in its first m components, and zero elsewhere.
Similarly, the solution will be composed of two parts. The first segment of
length m, which I call r, is the residual vector, because you can deduce
from the system that r = b - A*x. The second segment, x, will be the
solution you are looking for.

In your case, you will also want to try LSQR (available in NLPy). On paper,
it is equivalent to applying CG to the square, symmetric and positive
semi-definite system dot(A.T,A)x = dot(A.T,b), but it should be more stable
(and it never forms the matrix dot(A.T,A)). It may turn out to be more
efficient than MINRES.

Keep in mind that iterative methods are appropriate for large systems but
they often require a good preconditioner to be effective. For an excellent
account of all this, you may want to look at Yousef Saad's book "Iterative
Methods for Sparse Linear Systems."

Good luck,

--
Dominique

-------------- next part --------------
An HTML attachment was scrubbed...
URL:
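In code, the augmented system described above might be assembled as in the
sketch below. The shapes are illustrative, and it assumes a SciPy that
provides scipy.sparse.bmat and scipy.sparse.linalg.minres:

    import numpy as np
    import scipy.sparse as sp
    from scipy.sparse.linalg import minres

    m, n = 100, 10
    A = sp.csr_matrix(np.random.rand(m, n))
    b = np.random.rand(m)

    # K = [ I    A ]  is symmetric but indefinite, so MINRES applies
    #     [ A.T  0 ]  where plain CG would not.
    K = sp.bmat([[sp.eye(m, m), A], [A.T, None]], format='csr')
    rhs = np.concatenate([b, np.zeros(n)])

    sol, info = minres(K, rhs, tol=1e-10)
    r, x = sol[:m], sol[m:]    # first m entries are r = b - A*x, the rest are x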
From stephens.js at gmail.com Sat Dec 11 16:15:54 2010
From: stephens.js at gmail.com (Scott Stephens)
Date: Sat, 11 Dec 2010 15:15:54 -0600
Subject: [SciPy-User] Basic Question About numpy.dot
Message-ID:

Why does this work:

>>> import numpy as np
>>> np.dot(np.array([1.,2.]),np.array([[10.,20.],[0.1,0.2]]))
array([ 10.2,  20.4])

But this doesn't:

>>> import numpy as np
>>> np.dot(np.array([1.,2.]),np.array([[10.]]))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: matrices are not aligned

Thanks,

Scott

From asmund.hjulstad at gmail.com Sun Dec 12 08:41:38 2010
From: asmund.hjulstad at gmail.com (=?ISO-8859-1?Q?=C5smund_Hjulstad?=)
Date: Sun, 12 Dec 2010 16:41:38 +0300
Subject: [SciPy-User] Basic Question About numpy.dot
In-Reply-To: References: Message-ID:

2010/12/12 Scott Stephens
> [clip: quoted question]

np.dot is matrix multiplication, so the last dimension of the first
array must match the first dimension of the second array. In your first
example you have arrays with shapes (2,) and (2,2), so the dimensions
match. Your second example has shapes (2,) and (1,1), so they don't
match. (The matrices are not aligned.)

Hope it answers your question. (You may also look at help(np.dot).)

Åsmund Hjulstad

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From anand.prabhakar.patil at gmail.com Mon Dec 13 06:34:15 2010
From: anand.prabhakar.patil at gmail.com (Anand Patil)
Date: Mon, 13 Dec 2010 03:34:15 -0800 (PST)
Subject: [SciPy-User] JOB: Contract to bundle NumPy/CUDA environment on an Amazon machine image
Message-ID: <83329922-69e0-4f48-abb6-149caca60adb@k11g2000vbf.googlegroups.com>

Dear all,

My research group in the department of zoology at Oxford University
(www.map.ox.ac.uk) is seeking a contractor to do the following on Amazon
EC2's new Cluster GPU instance type:

- Install Python 2.6 or 2.7, NumPy and SciPy (with multithreaded linear
algebra), Matplotlib (make sure the TKAgg and PDF backends work), Basemap,
PyTables, and PyCUDA.
- Install MAGMA, http://icl.cs.utk.edu/magma/ as well as CUBLAS and CURAND
(http://developer.nvidia.com/object/cuda_3_2_downloads.html) as shared
libraries, so that they can be used from PyCUDA.
- Test that the libraries listed above can be used from PyCUDA, and that
they are producing correct results.
- Take a snapshot of the instance as an Amazon Machine Image.
- Keep comprehensive, reproducible notes on the build process for us.
- Be available for up to 10 hours' support if we encounter problems in the
future.

We are offering $2700 US for the work listed above. If you are interested,
please respond to Jennie Charlton (jennie.charlton at zoo.ox.ac.uk) and
attach a resume with references. Please feel free to circulate this email
to any colleagues who might be interested.

Best wishes,
Anand

From jeremy at jeremysanders.net Mon Dec 13 09:06:40 2010
From: jeremy at jeremysanders.net (Jeremy Sanders)
Date: Mon, 13 Dec 2010 14:06:40 +0000
Subject: [SciPy-User] ANN: Veusz 1.10
Message-ID:

I'm pleased to announce Veusz 1.10, a Python-based scientific plotting
package. Please see below for the release notes.
Jeremy Veusz 1.10 ---------- Velvet Ember Under Sky Zenith ----------------------------- http://home.gna.org/veusz/ Veusz is Copyright (C) 2003-2010 Jeremy Sanders Licenced under the GPL (version 2 or greater). Veusz is a Qt4 based scientific plotting package. It is written in Python, using PyQt4 for display and user-interfaces, and numpy for handling the numeric data. Veusz is designed to produce publication-ready Postscript/PDF/SVG output. The user interface aims to be simple, consistent and powerful. Veusz provides a GUI, command line, embedding and scripting interface (based on Python) to its plotting facilities. It also allows for manipulation and editing of datasets. Data can be captured from external sources such as internet sockets or other programs. Changes in 1.10: * Box plot widget added, which can be given statistics to plot or calculated from datasets * Polar plot widget added * Datasets are now easier to construct and edit in the Data->Edit dialog box * CSV reader will assume a text dataset if it cannot convert first item to a number * Add color sequence plugin for making a range of widget colors * Import plugin for QDP files added * Date and times can be also written in local formats * Reload data dialog box can reload at intervals and is now non-modal * 2D datasets can be created based on expressions of other 2D datasets Minor changes: * Option to change size of ends of error bars * Margin size option added for key widget * Add --listen option to veusz command to replace veusz_listen. * Add --quiet option to run commands without displaying a window * Add --export option to export documents to graphics files and exit * PNG export compression increased * Add option to ignore number of lines after headers in CSV files Bug fixes: * Multiple datasets can now be properly created from dataset plugin dialog * X and Y ranges of 2D datasets are now correct when converted from X,Y,Z 1D datasets * Bounding boxes of resizing rectangles, ellipses and images are fixed * min and max coordinate range now works for plotting functions of y * Remove duplicate linked files when using import plugins * Several crash reports fixed * More robust code in data->edit dialog box * veusz_listen now works in Windows (not in binary package yet) Features of package: * X-Y plots (with errorbars) * Line and function plots * Contour plots * Images (with colour mappings and colorbars) * Stepped plots (for histograms) * Bar graphs * Vector field plots * Box plots * Polar plots * Plotting dates * Fitting functions to data * Stacked plots and arrays of plots * Plot keys * Plot labels * Shapes and arrows on plots * LaTeX-like formatting for text * EPS/PDF/PNG/SVG/EMF export * Scripting interface * Dataset creation/manipulation * Embed Veusz within other programs * Text, CSV, FITS and user-plugin importing * Data can be captured from external sources * User defined functions, constants and can import external Python functions * Plugin interface to allow user to write or load code to - import data using new formats - make new datasets, optionally linked to existing datasets - arbitrarily manipulate the document Requirements for source install: Python (2.4 or greater required) http://www.python.org/ Qt >= 4.3 (free edition) http://www.trolltech.com/products/qt/ PyQt >= 4.3 (SIP is required to be installed first) http://www.riverbankcomputing.co.uk/pyqt/ http://www.riverbankcomputing.co.uk/sip/ numpy >= 1.0 http://numpy.scipy.org/ Optional: Microsoft Core Fonts (recommended for nice output) 
http://corefonts.sourceforge.net/
PyFITS >= 1.1 (optional for FITS import)
http://www.stsci.edu/resources/software_hardware/pyfits
pyemf >= 2.0.0 (optional for EMF export)
http://pyemf.sourceforge.net/
For EMF and better SVG export, PyQt >= 4.6 or better is required, to fix
a bug in the C++ wrapping

For documentation on using Veusz, see the "Documents" directory. The
manual is in PDF, HTML and text format (generated from docbook). The
examples are also useful documentation.

Please also see and contribute to the Veusz wiki:
http://barmag.net/veusz-wiki/

Issues with the current version:
* Plots can sometimes be slow using antialiasing. Go to the preferences
  dialog or right click on the plot to disable antialiasing.
* Some recent versions of PyQt/SIP will cause crashes when exporting SVG
  files. Update to 4.7.4 (if released) or a recent snapshot to solve this
  problem.

If you enjoy using Veusz, I would love to hear from you. Please join the
mailing lists at https://gna.org/mail/?group=veusz to discuss new features
or if you'd like to contribute code. The latest code can always be found
in the SVN repository.

Jeremy Sanders

From ralf.gommers at googlemail.com Mon Dec 13 11:12:32 2010
From: ralf.gommers at googlemail.com (Ralf Gommers)
Date: Tue, 14 Dec 2010 00:12:32 +0800
Subject: [SciPy-User] ANN: SciPy 0.9.0 beta 1
Message-ID:

Hi,

I am pleased to announce the availability of the first beta of SciPy
0.9.0. This will be the first SciPy release to include support for
Python 3, as well as for Python 2.7. Please try this beta and report any
problems on the scipy-dev mailing list.

Binaries, sources and release notes can be found at
http://sourceforge.net/projects/scipy/files/scipy/0.9.0b1/. Note that
not all binaries (win32-py27, *-macosx10.3) are uploaded yet, they will
follow in the next day or two.

There are still a few known issues (so no need to report these):
1. Arpack related errors on 64-bit OS X.
2. Correlate complex192 errors on Windows.
3. correlate/conjugate current behavior is deprecated and should be
removed before RC1.

Enjoy,
Ralf

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From laserson at mit.edu Mon Dec 13 11:38:54 2010
From: laserson at mit.edu (Uri Laserson)
Date: Mon, 13 Dec 2010 11:38:54 -0500
Subject: [SciPy-User] Problems installing scipy on OS X 10.6 (Snow Leopard): libarpack
In-Reply-To: References: Message-ID:

Hi Ralf,

Sorry for my delayed response, I missed this message in my inbox. I have
since moved forward with this problem, but I now have a runtime problem.

I am using a python2.7 I built myself through the homebrew package
manager. Output from my compilers is as follows:

laserson at hobbes:~$ gcc -v
Using built-in specs.
Target: i686-apple-darwin10
Configured with: /var/tmp/gcc/gcc-5664~89/src/configure --disable-checking
--enable-werror --prefix=/usr --mandir=/share/man
--enable-languages=c,objc,c++,obj-c++
--program-transform-name=/^[cg][^.-]*$/s/$/-4.2/ --with-slibdir=/usr/lib
--build=i686-apple-darwin10 --program-prefix=i686-apple-darwin10-
--host=x86_64-apple-darwin10 --target=i686-apple-darwin10
--with-gxx-include-dir=/include/c++/4.2.1
Thread model: posix
gcc version 4.2.1 (Apple Inc. build 5664)

laserson at hobbes:~$ g++ -v
Using built-in specs.
From ralf.gommers at googlemail.com  Mon Dec 13 11:12:32 2010
From: ralf.gommers at googlemail.com (Ralf Gommers)
Date: Tue, 14 Dec 2010 00:12:32 +0800
Subject: [SciPy-User] ANN: SciPy 0.9.0 beta 1
Message-ID:

Hi,

I am pleased to announce the availability of the first beta of SciPy
0.9.0. This will be the first SciPy release to include support for
Python 3, as well as for Python 2.7. Please try this beta and report any
problems on the scipy-dev mailing list.

Binaries, sources and release notes can be found at
http://sourceforge.net/projects/scipy/files/scipy/0.9.0b1/. Note that not
all binaries (win32-py27, *-macosx10.3) are uploaded yet, they will
follow in the next day or two.

There are still a few known issues (so no need to report these):
1. Arpack related errors on 64-bit OS X.
2. Correlate complex192 errors on Windows.
3. correlate/conjugate current behavior is deprecated and should be
   removed before RC1.

Enjoy,
Ralf

From laserson at mit.edu  Mon Dec 13 11:38:54 2010
From: laserson at mit.edu (Uri Laserson)
Date: Mon, 13 Dec 2010 11:38:54 -0500
Subject: [SciPy-User] Problems installing scipy on OS X 10.6 (Snow
 Leopard): libarpack
In-Reply-To:
References:
Message-ID:

Hi Ralf,

Sorry for my delayed response, I missed this message in my inbox. I have
since moved forward with this problem, but I now have a runtime problem.

I am using a python2.7 that I built myself through the homebrew package
manager. Output from my compilers is as follows:

laserson at hobbes:~$ gcc -v
Using built-in specs.
Target: i686-apple-darwin10
Configured with: /var/tmp/gcc/gcc-5664~89/src/configure --disable-checking
 --enable-werror --prefix=/usr --mandir=/share/man
 --enable-languages=c,objc,c++,obj-c++
 --program-transform-name=/^[cg][^.-]*$/s/$/-4.2/ --with-slibdir=/usr/lib
 --build=i686-apple-darwin10 --program-prefix=i686-apple-darwin10-
 --host=x86_64-apple-darwin10 --target=i686-apple-darwin10
 --with-gxx-include-dir=/include/c++/4.2.1
Thread model: posix
gcc version 4.2.1 (Apple Inc. build 5664)

laserson at hobbes:~$ g++ -v
Using built-in specs.
Target: i686-apple-darwin10
Configured with: /var/tmp/gcc/gcc-5664~89/src/configure --disable-checking
 --enable-werror --prefix=/usr --mandir=/share/man
 --enable-languages=c,objc,c++,obj-c++
 --program-transform-name=/^[cg][^.-]*$/s/$/-4.2/ --with-slibdir=/usr/lib
 --build=i686-apple-darwin10 --program-prefix=i686-apple-darwin10-
 --host=x86_64-apple-darwin10 --target=i686-apple-darwin10
 --with-gxx-include-dir=/include/c++/4.2.1
Thread model: posix
gcc version 4.2.1 (Apple Inc. build 5664)

laserson at hobbes:~$ gfortran -v
Using built-in specs.
Target: i686-apple-darwin8
Configured with: /Builds/unix/gcc/gcc-4.2/configure --prefix=/usr/local
 --mandir=/share/man --program-transform-name=/^[cg][^.-]*$/s/$/-4.2/
 --build=i686-apple-darwin8 --host=i686-apple-darwin8
 --target=i686-apple-darwin8 --enable-languages=fortran
Thread model: posix
gcc version 4.2.3

I installed numpy and scipy loosely based on the directions here:
http://mail.scipy.org/pipermail/numpy-discussion/2010-August/052227.html

More specifically, after installing gfortran, I downloaded the following
versions of numpy and scipy:
numpy 1.5.1
scipy 0.8.0

I then set the following environment variables:
export MACOSX_DEPLOYMENT_TARGET=10.6
export CFLAGS="-arch i386 -arch x86_64"
export FFLAGS="-m32 -m64"
export LDFLAGS="-Wall -undefined dynamic_lookup -bundle -arch i386
 -arch x86_64 -framework Accelerate"

Then I built and installed numpy as follows (note: sudo is not needed, as
I took ownership of /usr/local):
python setup.py build --fcompiler=gnu95
python setup.py install

The results of numpy.test() are:
>>> numpy.test()
Running unit tests for numpy
NumPy version 1.5.1
NumPy is installed in
 /usr/local/Cellar/python/2.7.1/lib/python2.7/site-packages/numpy
Python version 2.7.1 (r271:86832, Dec 7 2010, 12:37:47) [GCC 4.2.1
 (Apple Inc. build 5664)]
nose version 0.11.4
....
Ran 3006 tests in 18.312s

OK (KNOWNFAIL=4, SKIP=1)

Then I installed scipy as follows:
python setup.py build --fcompiler=gnu95
python setup.py install

and ran the tests, giving output:
>>> scipy.test()
Running unit tests for scipy
NumPy version 1.5.1
NumPy is installed in
 /usr/local/Cellar/python/2.7.1/lib/python2.7/site-packages/numpy
SciPy version 0.8.0
SciPy is installed in
 /usr/local/Cellar/python/2.7.1/lib/python2.7/site-packages/scipy
Python version 2.7.1 (r271:86832, Dec 7 2010, 12:37:47) [GCC 4.2.1
 (Apple Inc. build 5664)]
nose version 0.11.4
RuntimeError: module compiled against ABI version 2000000 but this
 version of numpy is 1000009
....
Ran 4405 tests in 87.505s

OK (KNOWNFAIL=19, SKIP=28)

Note the RuntimeError listed above:
RuntimeError: module compiled against ABI version 2000000 but this
 version of numpy is 1000009

I can still import scipy fine. However, I then have a problem after
building matplotlib (GitHub). I can build it fine:

============================================================================
BUILDING MATPLOTLIB
            matplotlib: 1.0.0
                python: 2.7.1 (r271:86832, Dec 7 2010, 12:37:47)
                        [GCC 4.2.1 (Apple Inc. build 5664)]
              platform: darwin

REQUIRED DEPENDENCIES
                 numpy: 1.5.1
             freetype2: 12.0.6

OPTIONAL BACKEND DEPENDENCIES
                libpng: 1.2.44
               Tkinter: Tkinter: 81008, Tk: 8.5, Tcl: 8.5
                  Gtk+: no
                        * Building for Gtk+ requires pygtk; you must be
                        * able to "import gtk" in your build/install
                        * environment
       Mac OS X native: yes
                    Qt: no
                   Qt4: no
                 Cairo: no

However, when I import matplotlib.pyplot, I get:
>>> import matplotlib.pyplot
RuntimeError: module compiled against ABI version 2000000 but this
version of numpy is 1000009
Traceback (most recent call last):
  File "", line 1, in
  File "/Users/laserson/matplotlib/lib/python2.7/site-packages/matplotlib/pyplot.py", line 23, in
    from matplotlib.figure import Figure, figaspect
  File "/Users/laserson/matplotlib/lib/python2.7/site-packages/matplotlib/figure.py", line 16, in
    import artist
  File "/Users/laserson/matplotlib/lib/python2.7/site-packages/matplotlib/artist.py", line 6, in
    from transforms import Bbox, IdentityTransform, TransformedBbox, TransformedPath
  File "/Users/laserson/matplotlib/lib/python2.7/site-packages/matplotlib/transforms.py", line 34, in
    from matplotlib._path import affine_transform
ImportError: numpy.core.multiarray failed to import

However, when I separately try to import numpy.core.multiarray, I have
no problem.

Any ideas?

Thanks!
Uri

On Wed, Dec 8, 2010 at 06:18, Ralf Gommers wrote:
> On Wed, Dec 8, 2010 at 4:06 AM, Uri Laserson wrote:
>> Hi all,
>>
>> I am on a MacMini with Intel processor. I just installed OS X 10.6
>> and the latest Xcode that I could download, which included gcc 4.2.
>> I am using python 2.7 built from source using homebrew. I installed
>> the gfortran 4.2.3 binaries from http://r.research.att.com/tools/.
>>
>> I am trying to install numpy and scipy. numpy installs fine with or
>> without switching to g++-4.0. I have successfully installed it using
>> pip and also directly from source from the git repository.
>>
>> Scipy is giving me errors on install (the same errors whether I use
>> pip or try the svn repository). I installed it successfully yesterday
>> on a new Macbook Air using pip, after changing the symlinks to point
>> to g++-4.0. However, today on my MacMini, I am getting errors after
>> following the same protocol.
>>
>> The errors I am getting are here:
>> https://gist.github.com/732293
>
> The error indicates that 32 and 64 bit binaries are being mixed. Can
> you tell us the following:
> - what build command you used
> - what Python you are using (from python.org, from Apple, self-compiled?)
> - the output of "gcc -v", "g++ -v" and "gfortran -v"
>
> Ralf

From ben.root at ou.edu  Mon Dec 13 11:43:44 2010
From: ben.root at ou.edu (Benjamin Root)
Date: Mon, 13 Dec 2010 10:43:44 -0600
Subject: [SciPy-User] [SciPy-Dev] ANN: SciPy 0.9.0 beta 1
In-Reply-To:
References:
Message-ID:

On Mon, Dec 13, 2010 at 10:12 AM, Ralf Gommers wrote:
> Hi,
>
> I am pleased to announce the availability of the first beta of SciPy
> 0.9.0. This will be the first SciPy release to include support for
> Python 3, as well as for Python 2.7. Please try this beta and report
> any problems on the scipy-dev mailing list.
>
> Binaries, sources and release notes can be found at
> http://sourceforge.net/projects/scipy/files/scipy/0.9.0b1/.
> Note that not all binaries (win32-py27, *-macosx10.3) are uploaded yet,
> they will follow in the next day or two.
>
> There are still a few known issues (so no need to report these):
> 1. Arpack related errors on 64-bit OS X.
> 2. Correlate complex192 errors on Windows.
> 3. correlate/conjugate current behavior is deprecated and should be
>    removed before RC1.
>
> Enjoy,
> Ralf

Just did a clean rebuild (after a clean rebuild of numpy) and had two
errors in the tests:

======================================================================
FAIL: test_imresize (test_pilutil.TestPILUtil)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/bvr/Programs/numpy/numpy/testing/decorators.py", line 146, in skipper_func
    return f(*args, **kwargs)
  File "/home/bvr/Programs/scipy/scipy/misc/tests/test_pilutil.py", line 25, in test_imresize
    assert_equal(im1.shape,(11,22))
  File "/home/bvr/Programs/numpy/numpy/testing/utils.py", line 251, in assert_equal
    assert_equal(actual[k], desired[k], 'item=%r\n%s' % (k,err_msg), verbose)
  File "/home/bvr/Programs/numpy/numpy/testing/utils.py", line 313, in assert_equal
    raise AssertionError(msg)
AssertionError:
Items are not equal:
item=0
 ACTUAL: 10
 DESIRED: 11

======================================================================
FAIL: test_basic (test_signaltools.TestMedFilt)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/bvr/Programs/scipy/scipy/signal/tests/test_signaltools.py", line 284, in test_basic
    [ 0, 7, 11, 7, 4, 4, 19, 19, 24, 0]])
  File "/home/bvr/Programs/numpy/numpy/testing/utils.py", line 686, in assert_array_equal
    verbose=verbose, header='Arrays are not equal')
  File "/home/bvr/Programs/numpy/numpy/testing/utils.py", line 618, in assert_array_compare
    raise AssertionError(msg)
AssertionError:
Arrays are not equal

(mismatch 8.0%)
 x: array([[  0.,  50.,  50.,  50.,  42.,  15.,  15.,  18.,  27.,   0.],
        [  0.,  50.,  50.,  50.,  50.,  42.,  19.,  21.,  29.,   0.],
        [ 50.,  50.,  50.,  50.,  50.,  47.,  34.,  34.,  46.,  35.],...
 y: array([[ 0, 50, 50, 50, 42, 15, 15, 18, 27,  0],
        [ 0, 50, 50, 50, 50, 42, 19, 21, 29,  0],
        [50, 50, 50, 50, 50, 47, 34, 34, 46, 35],...

----------------------------------------------------------------------
Ran 4822 tests in 199.244s

FAILED (KNOWNFAIL=12, SKIP=35, failures=2)

Don't know about the first one, but the second one looks like a
type-casting issue: the displayed values are the same, except that one
array is floating point and the other is integer.

Ben Root

From nlgunther at yahoo.com  Mon Dec 13 12:17:29 2010
From: nlgunther at yahoo.com (Nicholas Gunther)
Date: Mon, 13 Dec 2010 09:17:29 -0800 (PST)
Subject: [SciPy-User] Problems with 64-bit Scipy Stats
Message-ID: <808999.92535.qm@web43141.mail.sp1.yahoo.com>

On my Windows 7, 64-bit AMD machine:

from scipy import stats

produces:

Traceback (most recent call last):
  File "", line 1, in
    from scipy import stats
  File "C:\Python26\lib\site-packages\scipy\stats\__init__.py", line 7, in
    from stats import *
  File "C:\Python26\lib\site-packages\scipy\stats\stats.py", line 202, in
    import scipy.special as special
  File "C:\Python26\lib\site-packages\scipy\special\__init__.py", line 8, in
    from basic import *
  File "C:\Python26\lib\site-packages\scipy\special\basic.py", line 6, in
    from _cephes import *
ImportError: DLL load failed: The specified module could not be found.
~~~
Since Scipy's covariance function works outside of the stats package,
for single-variable linear regression I can create a work-around, using
cov(y,x)/var(x) for beta and the squared correlation coefficient for
R squared. This is not ideal, however. Any suggestions for other
workarounds, or for how to fix this?

Thanks!
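For concreteness, that work-around can be written with plain numpy, so
that scipy.stats is never imported. This is a sketch on made-up data (the
arrays below are not from the thread); the one subtlety is that numpy's
cov and var must use the same normalization for the identity to hold.

# sketch of the cov/var work-around for single-variable regression,
# using only numpy; x and y are made-up example data
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

# beta = cov(x, y) / var(x); bias=1 makes np.cov divide by N, which
# matches np.var's default normalization
beta = np.cov(x, y, bias=1)[0, 1] / np.var(x)
alpha = y.mean() - beta * x.mean()       # intercept through the means
r2 = np.corrcoef(x, y)[0, 1] ** 2        # R squared = squared correlation

print beta, alpha, r2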
From cgohlke at uci.edu  Mon Dec 13 12:48:23 2010
From: cgohlke at uci.edu (Christoph Gohlke)
Date: Mon, 13 Dec 2010 09:48:23 -0800
Subject: [SciPy-User] Problems with 64-bit Scipy Stats
In-Reply-To: <808999.92535.qm@web43141.mail.sp1.yahoo.com>
References: <808999.92535.qm@web43141.mail.sp1.yahoo.com>
Message-ID: <4D065C67.2020601@uci.edu>

On 12/13/2010 9:17 AM, Nicholas Gunther wrote:
> On my Windows 7, 64-bit AMD machine:
>
> from scipy import stats produces:
> [...]
> ImportError: DLL load failed: The specified module could not be found.
>
> Since Scipy's covariance function works outside of the stats package,
> for single variable linear regression I can create a work-around, [...]
> Any suggestions for other workarounds, or for how to fix this?
> Thanks!

In case you are using Scipy binaries built with Visual C and the Intel
compiler suite, make sure to use a numpy build that uses the same
version of Intel's MKL and that the MKL and Intel runtime libraries are
found in the Windows DLL search path.

--
Christoph
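The DLL search path point can be tested directly: prepend the directory
holding the MKL/Intel runtime DLLs to PATH before importing scipy. A
minimal sketch, assuming the directory below, which is hypothetical and
must be pointed at wherever those DLLs actually live:

# sketch: make the MKL/Intel runtime DLLs findable before scipy.special
# loads _cephes; mkl_dir is a hypothetical location, not a known default
import os

mkl_dir = r'C:\Python26\Lib\site-packages\numpy\core'  # hypothetical
os.environ['PATH'] = mkl_dir + os.pathsep + os.environ['PATH']

from scipy import stats  # should now succeed if a missing DLL was the cause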
>> >> -- >> Christoph >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user >> > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > From nlgunther at yahoo.com Mon Dec 13 15:04:04 2010 From: nlgunther at yahoo.com (Nicholas Gunther) Date: Mon, 13 Dec 2010 12:04:04 -0800 (PST) Subject: [SciPy-User] Problems with 64-bit Scipy Stats In-Reply-To: <4D066DAE.1030701@uci.edu> Message-ID: <738915.34402.qm@web43139.mail.sp1.yahoo.com> Yes, numpy, scipy and matplotlib all from your website, for which I thanked you in an email from my gmail account. I'll try what you suggest. Regs, Nick --- On Mon, 12/13/10, Christoph Gohlke wrote: > From: Christoph Gohlke > Subject: Re: [SciPy-User] Problems with 64-bit Scipy Stats > To: "SciPy Users List" > Date: Monday, December 13, 2010, 2:02 PM > In case those binaries are from my > website, use the numpy MKL installer, > numpy-1.5.1.win-amd64-py2.6-mkl.?exe. > > Christoph > > On 12/13/2010 10:47 AM, Nicholas Gunther wrote: > > scipy.cov works; scipy.stats fails as indicated. > > I am using > > scipy-0.8.0.win-amd64-py2.6 > > numpy-1.5.1.win-amd64-py2.6 > > mingw-get-inst-20101030 > > python-2.6.6.amd64 > > Windows 7 Home Premium > > AMD V120 Processor > > > > Thanks! > > > > Best > > Nick > > > > > > > > > > > > --- On Mon, 12/13/10, Christoph Gohlke? > wrote: > > > >> From: Christoph Gohlke > >> Subject: Re: [SciPy-User] Problems with 64-bit > Scipy Stats > >> To: "SciPy Users List" > >> Date: Monday, December 13, 2010, 12:48 PM > >> > >> > >> On 12/13/2010 9:17 AM, Nicholas Gunther wrote: > >>> On my Windows 7, 64-bit AMD machine: > >>> > >>> from scipy import stats produces: > >>> > >>> Traceback (most recent call last): > >>>? ? ? File "", > line 1, > >> in > >>>? ? ? ? from scipy import > stats > >>>? ? ? File > >> > "C:\Python26\lib\site-packages\scipy\stats\__init__.py", > >> line 7, in` > >>>? ? ? ? from stats import > * > >>>? ? ? File > >> > "C:\Python26\lib\site-packages\scipy\stats\stats.py", line > >> 202, in > >>>? ? ? ? import > scipy.special as special > >>>? ? ? File > >> > "C:\Python26\lib\site-packages\scipy\special\__init__.py", > >> line 8, in > >>>? ? ? ? from basic import > * > >>>? ? ? File > >> > "C:\Python26\lib\site-packages\scipy\special\basic.py", > line > >> 6, in > >>>? ? ? ? from _cephes import > * > >>> ImportError: DLL load failed: The specified > module > >> could not be found. > >>> > >>> ~~~ > >>> Since Scipy's covariance function works > outside of the > >> stats package, for single variable linear > regression I can > >> create a work-around, using cov(y,x)/var(x) for > beta and the > >> correlation coefficient squared for R > squared.? This is > >> not ideal, however. > >>> Any suggestion of other workarounds, or how to > fix > >> this? > >>> Thanks! > >> > >> > >> In case you are using Scipy binaries built with > Visual C > >> and the Intel > >> compiler suite, make sure to use a numpy build > that uses > >> the same > >> version of Intel's MKL and that the MKL and Intel > runtime > >> libraries are > >> found in the Windows DLL search path. 
From nlgunther at yahoo.com  Mon Dec 13 15:52:14 2010
From: nlgunther at yahoo.com (Nicholas Gunther)
Date: Mon, 13 Dec 2010 12:52:14 -0800 (PST)
Subject: [SciPy-User] Problems with 64-bit Scipy Stats
In-Reply-To: <4D066DAE.1030701@uci.edu>
Message-ID: <90240.91869.qm@web43143.mail.sp1.yahoo.com>

That seemed to fix it. And, of course, now that you have pointed me to
it, your page says very clearly that SciPy requires the MKL installer
for Numpy. I missed it the first time in part out of my ignorance of
what 'MKL' means.

I apologize for wasting your time by not reading and understanding your
installation page better, and I thank you very much for making these
packages available on 64-bit architecture, and particularly for helping
me install them correctly.

Thank you again.

--- On Mon, 12/13/10, Christoph Gohlke wrote:
> In case those binaries are from my website, use the numpy MKL
> installer, numpy-1.5.1.win-amd64-py2.6-mkl.exe.
> [...]
From ferrell at diablotech.com  Mon Dec 13 17:33:33 2010
From: ferrell at diablotech.com (Robert Ferrell)
Date: Mon, 13 Dec 2010 15:33:33 -0700
Subject: [SciPy-User] Python in middle school
Message-ID:

I may have an opportunity to help a few middle school students learn
python.

Anybody have suggestions for fun, simple projects that involve tools in
numpy/scipy/EPD that might be appropriate for (smart) 7th &/or 8th
graders?

Thanks,
-robert

From alan.isaac at gmail.com  Mon Dec 13 17:50:11 2010
From: alan.isaac at gmail.com (Alan G Isaac)
Date: Mon, 13 Dec 2010 17:50:11 -0500
Subject: [SciPy-User] Python in middle school
In-Reply-To:
References:
Message-ID: <4D06A323.6060203@gmail.com>

On 12/13/2010 5:33 PM, Robert Ferrell wrote:
> I may have an opportunity to help a few middle school students learn
> python.
>
> Anybody have suggestions for fun, simple projects that involve tools
> in numpy/scipy/EPD that might be appropriate for (smart) 7th &/or 8th
> graders?

Must it go beyond core Python?
If so, might PyGame be enough?
http://inventwithpython.com/

fwiw,
Alan Isaac

From ferrell at diablotech.com  Mon Dec 13 21:26:44 2010
From: ferrell at diablotech.com (Robert Ferrell)
Date: Mon, 13 Dec 2010 19:26:44 -0700
Subject: [SciPy-User] Python in middle school
In-Reply-To: <4D06A323.6060203@gmail.com>
References: <4D06A323.6060203@gmail.com>
Message-ID: <5BEBDDF4-B842-4F2B-881C-C5ED2A250B42@diablotech.com>

That looks like a great suggestion. I'll look into it a bit more.

thanks,
-robert

On Dec 13, 2010, at 3:50 PM, Alan G Isaac wrote:
> Must it go beyond core Python?
> If so, might PyGame be enough?
> http://inventwithpython.com/
> [...]

From josef.pktd at gmail.com  Mon Dec 13 22:03:53 2010
From: josef.pktd at gmail.com (josef.pktd at gmail.com)
Date: Mon, 13 Dec 2010 22:03:53 -0500
Subject: [SciPy-User] Python in middle school
In-Reply-To: <5BEBDDF4-B842-4F2B-881C-C5ED2A250B42@diablotech.com>
References: <4D06A323.6060203@gmail.com>
 <5BEBDDF4-B842-4F2B-881C-C5ED2A250B42@diablotech.com>
Message-ID:

On Mon, Dec 13, 2010 at 9:26 PM, Robert Ferrell wrote:
> That looks like a great suggestion. I'll look into it a bit more.
> [...]

Also, I found this story interesting:
http://learnpython.wordpress.com/2010/12/12/python-in-the-news/
(At a grade school of one of my kids they use Lego Mindstorm, which
sounds fun.)

Josef

From ferrell at diablotech.com  Mon Dec 13 23:20:58 2010
From: ferrell at diablotech.com (Robert Ferrell)
Date: Mon, 13 Dec 2010 21:20:58 -0700
Subject: [SciPy-User] Python in middle school
In-Reply-To:
References: <4D06A323.6060203@gmail.com>
 <5BEBDDF4-B842-4F2B-881C-C5ED2A250B42@diablotech.com>
Message-ID: <74ED4EF4-6A2D-4D66-9CFD-34622DD3DFA8@diablotech.com>

On Dec 13, 2010, at 8:03 PM, josef.pktd at gmail.com wrote:
> Also, I found this story interesting
> http://learnpython.wordpress.com/2010/12/12/python-in-the-news/
> (At a grade school of one of my kids they use Lego Mindstorm, which
> sounds fun.)

Another article in that guy's blog,
http://learnpython.wordpress.com/category/teaching/, also has some good
info.

thanks,
-robert
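As one concrete illustration of the kind of short numpy/matplotlib
project being asked about (an example sketched here, not a suggestion
made in the thread): a random walk takes a handful of lines to type in
and gives an immediate picture to talk about.

# a random walk in a few lines of numpy + matplotlib
import numpy as np
import matplotlib.pyplot as plt

steps = np.where(np.random.rand(1000) < 0.5, -1, 1)  # coin flips: -1 or +1
walk = steps.cumsum()                                # running total
plt.plot(walk)
plt.title('1000-step random walk')
plt.show()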
From ralf.gommers at googlemail.com  Tue Dec 14 06:08:02 2010
From: ralf.gommers at googlemail.com (Ralf Gommers)
Date: Tue, 14 Dec 2010 19:08:02 +0800
Subject: [SciPy-User] Problems installing scipy on OS X 10.6 (Snow
 Leopard): libarpack
In-Reply-To:
References:
Message-ID:

On Tue, Dec 14, 2010 at 12:38 AM, Uri Laserson wrote:
> Hi Ralf,
>
> Sorry for my delayed response, I missed this message in my inbox.

No problem at all.

> I have since moved forward with this problem, but I now have a runtime
> problem.
> [...]
> I installed numpy and scipy loosely based on the directions here:
> http://mail.scipy.org/pipermail/numpy-discussion/2010-August/052227.html

I see that unfortunately that mail did not receive any response, I don't
remember seeing it. The info in it is not entirely correct.

> More specifically, after installing gfortran, I downloaded the
> following versions of numpy and scipy:
> numpy 1.5.1
> scipy 0.8.0

scipy 0.8.0 has one issue with python 2.7. You should either use the
0.8.x svn branch or 0.9.0b1 from svn or Sourceforge.

> I then set the following environment variables:
> export MACOSX_DEPLOYMENT_TARGET=10.6

This is OK.

> export CFLAGS="-arch i386 -arch x86_64"
> export FFLAGS="-m32 -m64"
> export LDFLAGS="-Wall -undefined dynamic_lookup -bundle -arch i386
> -arch x86_64 -framework Accelerate"

This is incorrect. When you build with distutils, CFLAGS/FFLAGS will
overwrite all flags, not append them. You should just leave this out;
the default flags work fine with numpy 1.5.1 + scipy as specified above.

> The results of numpy.test() are:
> >>> numpy.test()
> [...]
> OK (KNOWNFAIL=4, SKIP=1)

Looks OK. As a sanity check, run numpy.test('full'). There are some
extra distutils tests that get run like that.

> and ran the tests, giving output:
> >>> scipy.test()
> [...]
> RuntimeError: module compiled against ABI version 2000000 but this
> version of numpy is 1000009
> ....
> Ran 4405 tests in 87.505s
>
> OK (KNOWNFAIL=19, SKIP=28)
>
> Note the RuntimeError listed above:
> RuntimeError: module compiled against ABI version 2000000 but this
> version of numpy is 1000009

That looks like you have some other numpy install hanging around.

> However, when I import matplotlib.pyplot, I get:
> >>> import matplotlib.pyplot
> RuntimeError: module compiled against ABI version 2000000 but this
> version of numpy is 1000009
> [...]
> ImportError: numpy.core.multiarray failed to import
>
> However, when I separately try to import numpy.core.multiarray, I
> have no problem.

Same problem as with scipy I guess.

Cheers,
Ralf
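The stale-install diagnosis above can be checked directly. A sketch,
assuming ordinary directory-based installs, that reports which numpy
actually wins at import time and where any other copies live:

# sketch: find the numpy that gets imported, and any other copies
# shadowing it on sys.path
import os
import sys
import numpy

print numpy.__version__   # expect 1.5.1
print numpy.__file__      # the installation actually being imported

for p in sys.path:
    candidate = os.path.join(p, 'numpy')
    if os.path.isdir(candidate):
        print 'numpy package found in:', candidate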
From ralf.gommers at googlemail.com  Tue Dec 14 07:14:57 2010
From: ralf.gommers at googlemail.com (Ralf Gommers)
Date: Tue, 14 Dec 2010 20:14:57 +0800
Subject: [SciPy-User] [SciPy-Dev] ANN: SciPy 0.9.0 beta 1
In-Reply-To:
References:
Message-ID:

On Tue, Dec 14, 2010 at 12:46 AM, Pauli Virtanen wrote:
> Tue, 14 Dec 2010 00:12:32 +0800, Ralf Gommers wrote:
>> I am pleased to announce the availability of the first beta of SciPy
>> 0.9.0. This will be the first SciPy release to include support for
>> Python 3, as well as for Python 2.7.
>
> Note: scipy.weave does not work on Python 3 yet; the other parts of
> Scipy do.

Do you (or anyone else) plan to work on this before the final release?
If not I'll update the release notes.

Ralf

From laserson at mit.edu  Tue Dec 14 10:00:11 2010
From: laserson at mit.edu (Uri Laserson)
Date: Tue, 14 Dec 2010 10:00:11 -0500
Subject: [SciPy-User] Problems installing scipy on OS X 10.6 (Snow
 Leopard): libarpack
In-Reply-To:
References:
Message-ID:

Thanks for the help. I will probably retry with scipy 0.9.0. Strangely
enough, I did something on my computer and it worked. I think that what
ended up happening was building numpy 1.5.1 with all the normal defaults
on the computer. Building MPL on top of that was then fine. And finally,
I used the special compile flags only for scipy 0.8.0. Both numpy and
scipy pass numpy.test() and scipy.test().

When I perform numpy.test('full') as you suggest, I get some
fortran-related errors, like:
ERROR: test_return_real.TestF90ReturnReal.test_all
ERROR: test_return_integer.TestF77ReturnInteger.test_all
etc. (12 in total)

Are these significant?

Uri

...................................................................................
Uri Laserson
Graduate Student, Biomedical Engineering
Harvard-MIT Division of Health Sciences and Technology
M +1 917 742 8019
laserson at mit.edu

On Tue, Dec 14, 2010 at 06:08, Ralf Gommers wrote:
> [...]
From ralf.gommers at googlemail.com  Tue Dec 14 10:20:27 2010
From: ralf.gommers at googlemail.com (Ralf Gommers)
Date: Tue, 14 Dec 2010 23:20:27 +0800
Subject: [SciPy-User] Problems installing scipy on OS X 10.6 (Snow
 Leopard): libarpack
In-Reply-To:
References:
Message-ID:

On Tue, Dec 14, 2010 at 11:00 PM, Uri Laserson wrote:
> Thanks for the help. I will probably retry with scipy 0.9.0.
> [...]
> When I perform numpy.test('full') as you suggest, I get some
> fortran-related errors, like:
> ERROR: test_return_real.TestF90ReturnReal.test_all
> ERROR: test_return_integer.TestF77ReturnInteger.test_all
> etc. (12 in total)
>
> Are these significant?

If it all works for you probably not, but there was an issue with those
errors a few months ago. They are related to 32/64-bit architecture
errors as well. If you have any problem with that again, please send me
your full build and test logs.

Cheers,
Ralf
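A quick way to act on the 32/64-bit hint is to check what the running
interpreter itself reports; a sketch:

# sketch: confirm the word size of the running Python and which numpy
# build it imports, since mixed 32/64-bit builds trigger exactly these
# f2py return-value test errors
import platform
import sys
import numpy

print platform.architecture()[0]  # '32bit' or '64bit' for this Python
print sys.maxsize > 2**32         # True only on a 64-bit Python
print numpy.__file__              # verify the expected numpy build is used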
>>>> - the output of "gcc -v", "g++ -v" and "gfortran -v" >>>> >>>> Ralf >>>> >>>> >>>> >>>> _______________________________________________ >>>> SciPy-User mailing list >>>> SciPy-User at scipy.org >>>> http://mail.scipy.org/mailman/listinfo/scipy-user >>>> >>>> >>> >>> _______________________________________________ >>> SciPy-User mailing list >>> SciPy-User at scipy.org >>> http://mail.scipy.org/mailman/listinfo/scipy-user >>> >>> >> >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user >> >> > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From laserson at mit.edu Tue Dec 14 10:23:58 2010 From: laserson at mit.edu (Uri Laserson) Date: Tue, 14 Dec 2010 10:23:58 -0500 Subject: [SciPy-User] Problems installing scipy on OS X 10.6 (Snow Leopard): libarpack In-Reply-To: References: Message-ID: Will do. Thanks again for all the help! Uri ................................................................................... Uri Laserson Graduate Student, Biomedical Engineering Harvard-MIT Division of Health Sciences and Technology M +1 917 742 8019 laserson at mit.edu On Tue, Dec 14, 2010 at 10:20, Ralf Gommers wrote: > > > On Tue, Dec 14, 2010 at 11:00 PM, Uri Laserson wrote: > >> Thanks for the help. I will probably retry with scipy 0.9.0. Strangely >> enough, I did something on my computer and it worked. I think that what >> ended up happening was building numpy 1.5.1 with all the normal defaults on >> the computer. Building MPL on top of that was then fine. And finally, I >> used the special compile flags only for scipy 0.8.0. Both numpy as scipy >> pass numpy.test() and scipy.test() >> >> When I perform numpy.test('full') as you suggest, I get some >> fortran-related errors, like: >> ERROR: test_return_real.TestF90ReturnReal.test_all >> ERROR: test_return_integer.TestF77ReturnInteger.test_all >> etc. (12 in total) >> >> Are these significant? >> > > If it all works for you probably not, but there was an issue with those > errors a few months ago. They are related to 32/64-bit architecture errors > as well. If you have any problem with that again, please send me your full > build and test logs. > > Cheers, > Ralf > > > >> Uri >> >> >> ................................................................................... >> Uri Laserson >> Graduate Student, Biomedical Engineering >> Harvard-MIT Division of Health Sciences and Technology >> M +1 917 742 8019 >> laserson at mit.edu >> >> >> >> On Tue, Dec 14, 2010 at 06:08, Ralf Gommers wrote: >> >>> >>> >>> On Tue, Dec 14, 2010 at 12:38 AM, Uri Laserson wrote: >>> >>>> Hi Ralf, >>>> >>>> Sor for my delayed response, I missed this message in my inbox. >>>> >>> >>> No problem at all. >>> >>> >>>> I have since moved forward with this problem, but I now have a runtime >>>> problem. >>>> >>>> I am using python2.7 I built myself through the homebrew package >>>> manager. >>>> >>>> Output from my compilers is as follows: >>>> laserson at hobbes:~$ gcc -v >>>> Using built-in specs. 
>>>> Target: i686-apple-darwin10 >>>> Configured with: /var/tmp/gcc/gcc-5664~89/src/configure >>>> --disable-checking --enable-werror --prefix=/usr --mandir=/share/man >>>> --enable-languages=c,objc,c++,obj-c++ >>>> --program-transform-name=/^[cg][^.-]*$/s/$/-4.2/ --with-slibdir=/usr/lib >>>> --build=i686-apple-darwin10 --program-prefix=i686-apple-darwin10- >>>> --host=x86_64-apple-darwin10 --target=i686-apple-darwin10 >>>> --with-gxx-include-dir=/include/c++/4.2.1 >>>> Thread model: posix >>>> gcc version 4.2.1 (Apple Inc. build 5664) >>>> >>>> laserson at hobbes:~$ g++ -v >>>> Using built-in specs. >>>> Target: i686-apple-darwin10 >>>> Configured with: /var/tmp/gcc/gcc-5664~89/src/configure >>>> --disable-checking --enable-werror --prefix=/usr --mandir=/share/man >>>> --enable-languages=c,objc,c++,obj-c++ >>>> --program-transform-name=/^[cg][^.-]*$/s/$/-4.2/ --with-slibdir=/usr/lib >>>> --build=i686-apple-darwin10 --program-prefix=i686-apple-darwin10- >>>> --host=x86_64-apple-darwin10 --target=i686-apple-darwin10 >>>> --with-gxx-include-dir=/include/c++/4.2.1 >>>> Thread model: posix >>>> gcc version 4.2.1 (Apple Inc. build 5664) >>>> >>>> laserson at hobbes:~$ gfortran -v >>>> Using built-in specs. >>>> Target: i686-apple-darwin8 >>>> Configured with: /Builds/unix/gcc/gcc-4.2/configure --prefix=/usr/local >>>> --mandir=/share/man --program-transform-name=/^[cg][^.-]*$/s/$/-4.2/ >>>> --build=i686-apple-darwin8 --host=i686-apple-darwin8 >>>> --target=i686-apple-darwin8 --enable-languages=fortran >>>> Thread model: posix >>>> gcc version 4.2.3 >>>> >>>> I installed numpy and scipy loosely based on the directions here: >>>> http://mail.scipy.org/pipermail/numpy-discussion/2010-August/052227.html >>>> >>>> I see that unfortunately that mail did not receive any response, I don't >>> remember seeing it. The info in it is not entirely correct. >>> >>> >>>> More specifically, after installing gfortran, I downloaded the following >>>> versions of numpy and scipy: >>>> numpy 1.5.1 >>>> scipy 0.8.0 >>>> >>> >>> scipy 0.8.0 has one issue with python 2.7. You should either use the >>> 0.8.x svn branch or 0.9.0b1 from svn or Sourceforge. >>> >>>> >>>> I then set the following environment variables: >>>> export MACOSX_DEPLOYMENT_TARGET=10.6 >>>> >>> This is OK. >>> >>> >>>> export CFLAGS="-arch i386 -arch x86_64" >>>> export FFLAGS="-m32 -m64" >>>> export LDFLAGS="-Wall -undefined dynamic_lookup -bundle -arch i386 -arch >>>> x86_64 -framework Accelerate" >>>> >>> >>> This is incorrect. When you build with distutils, CFLAGS/FFLAGS will >>> overwrite all flags, not append them. You should just leave this out, >>> default flags work fine with numpy 1.5.1 + scipy as specified above. >>> >>> >>>> >>>> Then I built and installed numpy as follows (note: sudo is not needed, >>>> as I took ownership of /usr/local): >>>> python setup.py build --fcompiler=gnu95 >>>> python setup.py install >>>> >>>> The results of numpy.test() are: >>>> >>> numpy.test() >>>> >>> >>> Looks OK. As a sanity check, run numpy.test('full'). There are some extra >>> distutils tests that get run like that. >>> >>> >>>> Running unit tests for numpy >>>> NumPy version 1.5.1 >>>> NumPy is installed in >>>> /usr/local/Cellar/python/2.7.1/lib/python2.7/site-packages/numpy >>>> Python version 2.7.1 (r271:86832, Dec 7 2010, 12:37:47) [GCC 4.2.1 >>>> (Apple Inc. build 5664)] >>>> nose version 0.11.4 >>>> .... 
>>>> Ran 3006 tests in 18.312s >>>> >>>> OK (KNOWNFAIL=4, SKIP=1) >>>> >>>> >>>> Then I installed scipy as follows: >>>> python setup.py build --fcompiler=gnu95 >>>> python setup.py install >>>> >>>> and ran the tests, giving output: >>>> >>> scipy.test() >>>> Running unit tests for scipy >>>> NumPy version 1.5.1 >>>> NumPy is installed in >>>> /usr/local/Cellar/python/2.7.1/lib/python2.7/site-packages/numpy >>>> SciPy version 0.8.0 >>>> SciPy is installed in >>>> /usr/local/Cellar/python/2.7.1/lib/python2.7/site-packages/scipy >>>> Python version 2.7.1 (r271:86832, Dec 7 2010, 12:37:47) [GCC 4.2.1 >>>> (Apple Inc. build 5664)] >>>> nose version 0.11.4 >>>> RuntimeError: module compiled against ABI version 2000000 but this >>>> version of numpy is 1000009 >>>> .... >>>> Ran 4405 tests in 87.505s >>>> >>>> OK (KNOWNFAIL=19, SKIP=28) >>>> >>>> >>>> Note the RuntimeError listed above: >>>> RuntimeError: module compiled against ABI version 2000000 but this >>>> version of numpy is 1000009 >>>> >>> >>> That looks like you have some other numpy install hanging around. >>> >>> >>>> >>>> I can still import scipy fine. However, I then have a problem after >>>> building matplotlib (GitHub). I can build it fine: >>>> >>>> ============================================================================ >>>> BUILDING MATPLOTLIB >>>> matplotlib: 1.0.0 >>>> python: 2.7.1 (r271:86832, Dec 7 2010, 12:37:47) [GCC >>>> 4.2.1 (Apple Inc. build 5664)] >>>> platform: darwin >>>> >>>> REQUIRED DEPENDENCIES >>>> numpy: 1.5.1 >>>> freetype2: 12.0.6 >>>> >>>> OPTIONAL BACKEND DEPENDENCIES >>>> libpng: 1.2.44 >>>> Tkinter: Tkinter: 81008, Tk: 8.5, Tcl: 8.5 >>>> Gtk+: no >>>> * Building for Gtk+ requires pygtk; you must be >>>> able >>>> * to "import gtk" in your build/install >>>> environment >>>> Mac OS X native: yes >>>> Qt: no >>>> Qt4: no >>>> Cairo: no >>>> >>>> However, when I import matplotlib.pyplot, I get: >>>> >>> import matplotlib.pyplot >>>> RuntimeError: module compiled against ABI version 2000000 but this >>>> version of numpy is 1000009 >>>> Traceback (most recent call last): >>>> File "", line 1, in >>>> File >>>> "/Users/laserson/matplotlib/lib/python2.7/site-packages/matplotlib/pyplot.py", >>>> line 23, in >>>> from matplotlib.figure import Figure, figaspect >>>> File >>>> "/Users/laserson/matplotlib/lib/python2.7/site-packages/matplotlib/figure.py", >>>> line 16, in >>>> import artist >>>> File >>>> "/Users/laserson/matplotlib/lib/python2.7/site-packages/matplotlib/artist.py", >>>> line 6, in >>>> from transforms import Bbox, IdentityTransform, TransformedBbox, >>>> TransformedPath >>>> File >>>> "/Users/laserson/matplotlib/lib/python2.7/site-packages/matplotlib/transforms.py", >>>> line 34, in >>>> from matplotlib._path import affine_transform >>>> ImportError: numpy.core.multiarray failed to import >>>> >>>> >>>> However, when I separately try to import numpy.core.multiarray, I have >>>> no problem. >>>> >>>> Same problem as with scipy I guess. >>> >>> Cheers, >>> Ralf >>> >>> >>> Any ideas? >>>> >>>> Thanks! >>>> Uri >>>> >>>> >>>> >>>> >>>> On Wed, Dec 8, 2010 at 06:18, Ralf Gommers >>> > wrote: >>>> >>>>> >>>>> >>>>> On Wed, Dec 8, 2010 at 4:06 AM, Uri Laserson wrote: >>>>> >>>>>> Hi all, >>>>>> >>>>>> I am on a MacMini with Intel processor. I just installed OS X 10.6 >>>>>> and the latest Xcode that I could download, which included gcc 4.2. I am >>>>>> using python 2.7 built from source using homebrew. I installed the gfortran >>>>>> 4.2.3 binaries from http://r.research.att.com/tools/. 
>>>>>> >>>>>> I am trying to install numpy and scipy. numpy installs fine with or >>>>>> without switching to g++-4.0. I have successfully installed it using pip >>>>>> and also directly from source from the git repository. >>>>>> >>>>>> Scipy is giving me errors on install (the same errors whether I use >>>>>> pip or try the svn repository). I installed it successfully yesterday on a >>>>>> new Macbook Air using pip, after changing the symlinks to point to g++-4.0. >>>>>> However, today on my MacMini, I am getting errors after following the same >>>>>> protocol. >>>>>> >>>>>> The errors I am getting are here: >>>>>> https://gist.github.com/732293 >>>>>> >>>>> >>>>> The error indicates that 32 and 64 bit binaries are being mixed. Can >>>>> you tell us the following: >>>>> - what build command you used >>>>> - what Python you are using (from python.org, from Apple, >>>>> self-compiled?) >>>>> - the output of "gcc -v", "g++ -v" and "gfortran -v" >>>>> >>>>> Ralf >>>>> >>>>> >>>>> >>>>> _______________________________________________ >>>>> SciPy-User mailing list >>>>> SciPy-User at scipy.org >>>>> http://mail.scipy.org/mailman/listinfo/scipy-user >>>>> >>>>> >>>> >>>> _______________________________________________ >>>> SciPy-User mailing list >>>> SciPy-User at scipy.org >>>> http://mail.scipy.org/mailman/listinfo/scipy-user >>>> >>>> >>> >>> _______________________________________________ >>> SciPy-User mailing list >>> SciPy-User at scipy.org >>> http://mail.scipy.org/mailman/listinfo/scipy-user >>> >>> >> >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user >> >> > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From vanforeest at gmail.com Tue Dec 14 11:25:35 2010 From: vanforeest at gmail.com (nicky van foreest) Date: Tue, 14 Dec 2010 17:25:35 +0100 Subject: [SciPy-User] Python in middle school In-Reply-To: <74ED4EF4-6A2D-4D66-9CFD-34622DD3DFA8@diablotech.com> References: <4D06A323.6060203@gmail.com> <5BEBDDF4-B842-4F2B-881C-C5ED2A250B42@diablotech.com> <74ED4EF4-6A2D-4D66-9CFD-34622DD3DFA8@diablotech.com> Message-ID: Hi, My son (11) reads Hello World! Computer Programming for Kids and Other Beginners [Paperback] Warren Sande Warren Sande (Author) He likes it, that is, most of it. The book ends with programming a game. bye Nicky On 14 December 2010 05:20, Robert Ferrell wrote: > > On Dec 13, 2010, at 8:03 PM, josef.pktd at gmail.com wrote: > >> On Mon, Dec 13, 2010 at 9:26 PM, Robert Ferrell wrote: >>> That looks like a great suggestion. ?I'll look into it a bit more. >>> >>> thanks, >>> -robert >>> >>> On Dec 13, 2010, at 3:50 PM, Alan G Isaac wrote: >>> >>>> On 12/13/2010 5:33 PM, Robert Ferrell wrote: >>>>> I may have an opportunity to help a few middle school students learn python. >>>>> >>>>> Anybody have suggestions for fun, simple projects that involve tools in numpy/scipy/EPD that might be appropriate for (smart) 7th&/or 8th graders? >>>>> >>>> >>>> >>>> Must it go beyond core Python? >>>> If so, might PyGame be enough? >>>> http://inventwithpython.com/ >> >> Also, I found this story interesting >> http://learnpython.wordpress.com/2010/12/12/python-in-the-news/ >> (At a grade school of one of my kids they use Lego Mindstorm, which sounds fun.) 
> > Another article in that guys blog, http://learnpython.wordpress.com/category/teaching/, also has some good info. > > thanks, > -robert >> > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From bsouthey at gmail.com Tue Dec 14 11:39:27 2010 From: bsouthey at gmail.com (Bruce Southey) Date: Tue, 14 Dec 2010 10:39:27 -0600 Subject: [SciPy-User] Python in middle school In-Reply-To: References: <4D06A323.6060203@gmail.com> <5BEBDDF4-B842-4F2B-881C-C5ED2A250B42@diablotech.com> <74ED4EF4-6A2D-4D66-9CFD-34622DD3DFA8@diablotech.com> Message-ID: <4D079DBF.6070208@gmail.com> On 12/14/2010 10:25 AM, nicky van foreest wrote: > Hi, > > My son (11) reads > > > Hello World! Computer Programming for Kids and Other Beginners [Paperback] > Warren Sande > Warren Sande (Author) > > He likes it, that is, most of it. The book ends with programming a game. > > bye > > Nicky > > > > On 14 December 2010 05:20, Robert Ferrell wrote: >> On Dec 13, 2010, at 8:03 PM, josef.pktd at gmail.com wrote: >> >>> On Mon, Dec 13, 2010 at 9:26 PM, Robert Ferrell wrote: >>>> That looks like a great suggestion. I'll look into it a bit more. >>>> >>>> thanks, >>>> -robert >>>> >>>> On Dec 13, 2010, at 3:50 PM, Alan G Isaac wrote: >>>> >>>>> On 12/13/2010 5:33 PM, Robert Ferrell wrote: >>>>>> I may have an opportunity to help a few middle school students learn python. >>>>>> >>>>>> Anybody have suggestions for fun, simple projects that involve tools in numpy/scipy/EPD that might be appropriate for (smart) 7th&/or 8th graders? >>>>>> >>>>> >>>>> Must it go beyond core Python? >>>>> If so, might PyGame be enough? >>>>> http://inventwithpython.com/ >>> Also, I found this story interesting >>> http://learnpython.wordpress.com/2010/12/12/python-in-the-news/ >>> (At a grade school of one of my kids they use Lego Mindstorm, which sounds fun.) >> Another article in that guys blog, http://learnpython.wordpress.com/category/teaching/, also has some good info. >> >> thanks, >> -robert Linux Format magazine (http://www.linuxformat.com/) has done various Python tutorials in the past but you need to be a subscriber to get most of these. The editors might be willing to provide these on request. There is this one: Issue 112 - Make a racing game - Expand your Python and PyGame skills with a top-down racer in under 100 lines of code. Vroom! (Mike Saunders): http://www.linuxformat.com/includes/download.php?PDF=LXF112.tut_code.pdf Bruce From josef.pktd at gmail.com Tue Dec 14 12:42:40 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Tue, 14 Dec 2010 12:42:40 -0500 Subject: [SciPy-User] understanding machine precision Message-ID: I thought that we get deterministic results, with identical machine precision errors, but I get (with some random a0, b0) >>> for i in range(5): x = scipy.linalg.lstsq(a0,b0)[0] x2 = scipy.linalg.lstsq(a0,b0)[0] print np.max(np.abs(x-x2)) 9.99200722163e-016 9.99200722163e-016 0.0 0.0 9.99200722163e-016 >>> a0.shape (100, 10) >>> b0.shape (100, 3) Why is the result not always the same? 
just curious

Josef

From faltet at pytables.org  Tue Dec 14 13:09:03 2010
From: faltet at pytables.org (Francesc Alted)
Date: Tue, 14 Dec 2010 19:09:03 +0100
Subject: [SciPy-User] understanding machine precision
In-Reply-To: 
References: 
Message-ID: <201012141909.03925.faltet@pytables.org>

A Tuesday 14 December 2010 18:42:40 josef.pktd at gmail.com escrigué:
> I thought that we get deterministic results, with identical machine
> precision errors, but I get (with some random a0, b0)
>
> >>> for i in range(5):
>        x = scipy.linalg.lstsq(a0,b0)[0]
>        x2 = scipy.linalg.lstsq(a0,b0)[0]
>        print np.max(np.abs(x-x2))
>
>
> 9.99200722163e-016
> 9.99200722163e-016
> 0.0
> 0.0
> 9.99200722163e-016
>
> >>> a0.shape
>
> (100, 10)
>
> >>> b0.shape
>
> (100, 3)
>
> Why is the result not always the same? just curious

That's really funny!  Could you please come up with a self-contained
example so as to see if others can reproduce that?

-- 
Francesc Alted

From josef.pktd at gmail.com  Tue Dec 14 13:38:12 2010
From: josef.pktd at gmail.com (josef.pktd at gmail.com)
Date: Tue, 14 Dec 2010 13:38:12 -0500
Subject: [SciPy-User] understanding machine precision
In-Reply-To: <201012141909.03925.faltet@pytables.org>
References: 
	<201012141909.03925.faltet@pytables.org>
Message-ID: 

On Tue, Dec 14, 2010 at 1:09 PM, Francesc Alted wrote:
> A Tuesday 14 December 2010 18:42:40 josef.pktd at gmail.com escrigué:
>> I thought that we get deterministic results, with identical machine
>> precision errors, but I get (with some random a0, b0)
>>
>> >>> for i in range(5):
>>        x = scipy.linalg.lstsq(a0,b0)[0]
>>        x2 = scipy.linalg.lstsq(a0,b0)[0]
>>        print np.max(np.abs(x-x2))
>>
>>
>> 9.99200722163e-016
>> 9.99200722163e-016
>> 0.0
>> 0.0
>> 9.99200722163e-016
>>
>> >>> a0.shape
>>
>> (100, 10)
>>
>> >>> b0.shape
>>
>> (100, 3)
>>
>> Why is the result not always the same? just curious
>
> That's really funny! Could you please come up with a self-contained
> example so as to see if others can reproduce that?

Essentially all I did was

a0 = np.random.randn(100,10)
b0 = a0.sum(1)[:,None] + np.random.randn(100,3)

I copied scipy.linalg pinv, pinv2 and lstsq to a local module, and
the results were not exactly the same as with the ones in scipy.
So, I did this check for scipy also.
the attached script produces different results in each run (on WindowsXP 32), for example lstsq 1.55431223448e-015 1.55431223448e-015 0.0 1.55431223448e-015 1.55431223448e-015 pinv 5.20417042793e-017 5.20417042793e-017 0.0 0.0 0.0 pinv2 0.0 0.0 5.76795555762e-017 5.76795555762e-017 4.51028103754e-017 Thanks, Josef > > -- > Francesc Alted > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > -------------- next part -------------- import numpy as np import scipy.linalg np.random.seed(12345) a0 = np.random.randn(100,10) b0 = a0.sum(1)[:,None] + np.random.randn(100,3) print '\nlstsq' for i in range(5): x = scipy.linalg.lstsq(a0,b0)[0] x2 = scipy.linalg.lstsq(a0,b0)[0] print np.max(np.abs(x-x2)) print '\npinv' for i in range(5): x = scipy.linalg.pinv(a0) x2 = scipy.linalg.pinv(a0) print np.max(np.abs(x-x2)) print '\npinv2' for i in range(5): x = scipy.linalg.pinv2(a0) x2 = scipy.linalg.pinv2(a0) print np.max(np.abs(x-x2)) From kwgoodman at gmail.com Tue Dec 14 13:42:23 2010 From: kwgoodman at gmail.com (Keith Goodman) Date: Tue, 14 Dec 2010 10:42:23 -0800 Subject: [SciPy-User] understanding machine precision In-Reply-To: References: Message-ID: On Tue, Dec 14, 2010 at 9:42 AM, wrote: > I thought that we get deterministic results, with identical machine > precision errors, but I get (with some random a0, b0) > >>>> for i in range(5): > ? ? ? ?x = scipy.linalg.lstsq(a0,b0)[0] > ? ? ? ?x2 = scipy.linalg.lstsq(a0,b0)[0] > ? ? ? ?print np.max(np.abs(x-x2)) > > > 9.99200722163e-016 > 9.99200722163e-016 > 0.0 > 0.0 > 9.99200722163e-016 I've started a couple of threads in the past on repeatability. Most of the discussion ends up being about ATLAS. I suggest repeating the test without ATLAS. From robert.kern at gmail.com Tue Dec 14 13:47:04 2010 From: robert.kern at gmail.com (Robert Kern) Date: Tue, 14 Dec 2010 12:47:04 -0600 Subject: [SciPy-User] understanding machine precision In-Reply-To: References: Message-ID: On Tue, Dec 14, 2010 at 12:42, Keith Goodman wrote: > On Tue, Dec 14, 2010 at 9:42 AM, ? wrote: >> I thought that we get deterministic results, with identical machine >> precision errors, but I get (with some random a0, b0) >> >>>>> for i in range(5): >> ? ? ? ?x = scipy.linalg.lstsq(a0,b0)[0] >> ? ? ? ?x2 = scipy.linalg.lstsq(a0,b0)[0] >> ? ? ? ?print np.max(np.abs(x-x2)) >> >> >> 9.99200722163e-016 >> 9.99200722163e-016 >> 0.0 >> 0.0 >> 9.99200722163e-016 > > I've started a couple of threads in the past on repeatability. Most of > the discussion ends up being about ATLAS. I suggest repeating the test > without ATLAS. On OS X with numpy linked against the builtin Accelerate.framework (which is based off of ATLAS), I get the same result every time. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." ? -- Umberto Eco From josef.pktd at gmail.com Tue Dec 14 13:57:43 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Tue, 14 Dec 2010 13:57:43 -0500 Subject: [SciPy-User] understanding machine precision In-Reply-To: References: Message-ID: On Tue, Dec 14, 2010 at 1:47 PM, Robert Kern wrote: > On Tue, Dec 14, 2010 at 12:42, Keith Goodman wrote: >> On Tue, Dec 14, 2010 at 9:42 AM, ? 
wrote: >>> I thought that we get deterministic results, with identical machine >>> precision errors, but I get (with some random a0, b0) >>> >>>>>> for i in range(5): >>> ? ? ? ?x = scipy.linalg.lstsq(a0,b0)[0] >>> ? ? ? ?x2 = scipy.linalg.lstsq(a0,b0)[0] >>> ? ? ? ?print np.max(np.abs(x-x2)) >>> >>> >>> 9.99200722163e-016 >>> 9.99200722163e-016 >>> 0.0 >>> 0.0 >>> 9.99200722163e-016 >> >> I've started a couple of threads in the past on repeatability. Most of >> the discussion ends up being about ATLAS. I suggest repeating the test >> without ATLAS. Is there a way to turn ATLAS off without recompiling? > > On OS X with numpy linked against the builtin Accelerate.framework > (which is based off of ATLAS), I get the same result every time. When I run the script on the commandline (with a new python each time), I get the same results each time, but within the loop the results still differ up to 1.55431223448e-015. On IDLE when I remain in the same session, results differ with each run. An explanation that ATLAS has some builtin state is enough for me, I try not to rely on numerical precision in this range. Josef > > -- > Robert Kern > > "I have come to believe that the whole world is an enigma, a harmless > enigma that is made terrible by our own mad attempt to interpret it as > though it had an underlying truth." > ? -- Umberto Eco > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From jsseabold at gmail.com Tue Dec 14 14:05:30 2010 From: jsseabold at gmail.com (Skipper Seabold) Date: Tue, 14 Dec 2010 14:05:30 -0500 Subject: [SciPy-User] understanding machine precision In-Reply-To: References: Message-ID: On Tue, Dec 14, 2010 at 1:47 PM, Robert Kern wrote: > On Tue, Dec 14, 2010 at 12:42, Keith Goodman wrote: >> On Tue, Dec 14, 2010 at 9:42 AM, ? wrote: >>> I thought that we get deterministic results, with identical machine >>> precision errors, but I get (with some random a0, b0) >>> >>>>>> for i in range(5): >>> ? ? ? ?x = scipy.linalg.lstsq(a0,b0)[0] >>> ? ? ? ?x2 = scipy.linalg.lstsq(a0,b0)[0] >>> ? ? ? ?print np.max(np.abs(x-x2)) >>> >>> >>> 9.99200722163e-016 >>> 9.99200722163e-016 >>> 0.0 >>> 0.0 >>> 9.99200722163e-016 >> >> I've started a couple of threads in the past on repeatability. Most of >> the discussion ends up being about ATLAS. I suggest repeating the test >> without ATLAS. > > On OS X with numpy linked against the builtin Accelerate.framework > (which is based off of ATLAS), I get the same result every time. > > Windows 7, numpy 64 bit mkl binaries from Christoph, I get 0.0 every time. Using 32 bit mkl binaries in IPython interpreter and from the command line I do not get reproducible results. If I do astype(float32) I seem to get 0.0 most of the times but more infrequently get something like 3e-8. Skipper From robert.kern at gmail.com Tue Dec 14 14:07:39 2010 From: robert.kern at gmail.com (Robert Kern) Date: Tue, 14 Dec 2010 13:07:39 -0600 Subject: [SciPy-User] understanding machine precision In-Reply-To: References: Message-ID: On Tue, Dec 14, 2010 at 12:57, wrote: > On Tue, Dec 14, 2010 at 1:47 PM, Robert Kern wrote: >> On Tue, Dec 14, 2010 at 12:42, Keith Goodman wrote: >>> On Tue, Dec 14, 2010 at 9:42 AM, ? wrote: >>>> I thought that we get deterministic results, with identical machine >>>> precision errors, but I get (with some random a0, b0) >>>> >>>>>>> for i in range(5): >>>> ? ? ? ?x = scipy.linalg.lstsq(a0,b0)[0] >>>> ? ? ? 
?x2 = scipy.linalg.lstsq(a0,b0)[0]
>>>> ? ? ? ?print np.max(np.abs(x-x2))
>>>>
>>>>
>>>> 9.99200722163e-016
>>>> 9.99200722163e-016
>>>> 0.0
>>>> 0.0
>>>> 9.99200722163e-016
>>>
>>> I've started a couple of threads in the past on repeatability. Most of
>>> the discussion ends up being about ATLAS. I suggest repeating the test
>>> without ATLAS.
>
> Is there a way to turn ATLAS off without recompiling?

No.

>> On OS X with numpy linked against the builtin Accelerate.framework
>> (which is based off of ATLAS), I get the same result every time.
>
> When I run the script on the commandline (with a new python each
> time), I get the same results each time, but within the loop the
> results still differ up to 1.55431223448e-015. On IDLE when I remain
> in the same session, results differ with each run.

I mean that I get "0.0" for each iteration of each loop even if I push
the number of iterations up to 500 or so.

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
  -- Umberto Eco

From faltet at pytables.org  Tue Dec 14 14:11:42 2010
From: faltet at pytables.org (Francesc Alted)
Date: Tue, 14 Dec 2010 20:11:42 +0100
Subject: [SciPy-User] understanding machine precision
In-Reply-To: 
References: <201012141909.03925.faltet@pytables.org>
	
Message-ID: <201012142011.42107.faltet@pytables.org>

A Tuesday 14 December 2010 19:38:12 josef.pktd at gmail.com escrigué:
> On Tue, Dec 14, 2010 at 1:09 PM, Francesc Alted wrote:
> > A Tuesday 14 December 2010 18:42:40 josef.pktd at gmail.com escrigué:
> >> I thought that we get deterministic results, with identical
> >> machine precision errors, but I get (with some random a0, b0)
> >>
> >> >>> for i in range(5):
> >>        x = scipy.linalg.lstsq(a0,b0)[0]
> >>        x2 = scipy.linalg.lstsq(a0,b0)[0]
> >>        print np.max(np.abs(x-x2))
> >>
> >>
> >> 9.99200722163e-016
> >> 9.99200722163e-016
> >> 0.0
> >> 0.0
> >> 9.99200722163e-016
> >>
> >> >>> a0.shape
> >>
> >> (100, 10)
> >>
> >> >>> b0.shape
> >>
> >> (100, 3)
> >>
> >> Why is the result not always the same? just curious
> >
> > That's really funny! Could you please come up with a
> > self-contained example so as to see if others can reproduce that?
>
> Essentially all I did was
>
> a0 = np.random.randn(100,10)
> b0 = a0.sum(1)[:,None] + np.random.randn(100,3)
>
> I copied scipy.linalg pinv, pinv2 and lstsq to a local module, and
> the results were not exactly the same as with the ones in scipy.
> So, I did this check for scipy also.
>
> the attached script produces different results in each run (on
> WindowsXP 32), for example
>
> lstsq
> 1.55431223448e-015
> 1.55431223448e-015
> 0.0
> 1.55431223448e-015
> 1.55431223448e-015
>
> pinv
> 5.20417042793e-017
> 5.20417042793e-017
> 0.0
> 0.0
> 0.0
>
> pinv2
> 0.0
> 0.0
> 5.76795555762e-017
> 5.76795555762e-017
> 4.51028103754e-017

I cannot reproduce that:

lstsq
0.0
0.0
0.0
0.0
0.0

pinv
0.0
0.0
0.0
0.0
0.0

Using numpy without ATLAS.

So yes, as others pointed out, it seems that ATLAS is responsible for
this. I think ATLAS always performs some small calculation in order to
determine the optimal block sizes for its computations. So, my guess is
that, depending on the stress of your machine (and maybe on the phase
of the moon too), these block sizes may be slightly different, leading
to a different order in computations and, hence, preventing strict
reproducibility.
Anyway, I must confess that running these calculations for *every*
computation strikes me as a bit excessive, but provided that the
matrices can be large, it might make some sense (i.e. the cost of the
block size guessing is negligible in general).

-- 
Francesc Alted

From nwagner at iam.uni-stuttgart.de  Tue Dec 14 14:17:10 2010
From: nwagner at iam.uni-stuttgart.de (Nils Wagner)
Date: Tue, 14 Dec 2010 20:17:10 +0100
Subject: [SciPy-User] understanding machine precision
In-Reply-To: 
References: <201012141909.03925.faltet@pytables.org>
	
Message-ID: 

Hi all,

I am using ATLAS

python -i try_deterministic.py

lstsq
0.0
0.0
0.0
0.0
0.0

pinv
0.0
0.0
0.0
0.0
0.0

pinv2
0.0
0.0
0.0
0.0
0.0

>>> from numpy import show_config
>>> show_config()
atlas_threads_info:
    libraries = ['lapack', 'ptf77blas', 'ptcblas', 'atlas']
    library_dirs = ['/home/nwagner/src/ATLAS3.8.2/mybuild/lib']
    define_macros = [('ATLAS_INFO', '"\\"3.8.2\\""')]
    language = f77
    include_dirs = ['/home/nwagner/src/ATLAS3.8.2/include']
blas_opt_info:
    libraries = ['ptf77blas', 'ptcblas', 'atlas']
    library_dirs = ['/home/nwagner/src/ATLAS3.8.2/mybuild/lib']
    define_macros = [('ATLAS_INFO', '"\\"3.8.2\\""')]
    language = c
    include_dirs = ['/home/nwagner/src/ATLAS3.8.2/include']
atlas_blas_threads_info:
    libraries = ['ptf77blas', 'ptcblas', 'atlas']
    library_dirs = ['/home/nwagner/src/ATLAS3.8.2/mybuild/lib']
    define_macros = [('ATLAS_INFO', '"\\"3.8.2\\""')]
    language = c
    include_dirs = ['/home/nwagner/src/ATLAS3.8.2/include']
lapack_opt_info:
    libraries = ['lapack', 'ptf77blas', 'ptcblas', 'atlas']
    library_dirs = ['/home/nwagner/src/ATLAS3.8.2/mybuild/lib']
    define_macros = [('ATLAS_INFO', '"\\"3.8.2\\""')]
    language = f77
    include_dirs = ['/home/nwagner/src/ATLAS3.8.2/include']
lapack_mkl_info:
  NOT AVAILABLE
blas_mkl_info:
  NOT AVAILABLE
mkl_info:
  NOT AVAILABLE

Nils

From faltet at pytables.org  Tue Dec 14 14:29:17 2010
From: faltet at pytables.org (Francesc Alted)
Date: Tue, 14 Dec 2010 20:29:17 +0100
Subject: [SciPy-User] understanding machine precision
In-Reply-To: 
References: 
Message-ID: <201012142029.18007.faltet@pytables.org>

A Tuesday 14 December 2010 20:17:10 Nils Wagner escrigué:
>> Hi all,
>>
>> I am using ATLAS
>>
>> python -i try_deterministic.py
>>
>> lstsq
>> 0.0
>> 0.0
>> 0.0
>> 0.0
>> 0.0

That's interesting.  Maybe Josef is using a threaded ATLAS?  I
positively know that threading introduces variability in the order that
the computations are done.  However, I'm not sure why ATLAS has
decided to use several threads for so small matrices ((100, 10)) :-/

-- 
Francesc Alted

From eric at depagne.org  Tue Dec 14 14:48:20 2010
From: eric at depagne.org (=?iso-8859-1?q?=C9ric_Depagne?=)
Date: Tue, 14 Dec 2010 20:48:20 +0100
Subject: [SciPy-User] understanding machine precision
In-Reply-To: 
References: 
Message-ID: <201012142048.21372.eric@depagne.org>

Hi all.
I do not know if I'm using ATLAS, but I get the following:

python try_deterministic.py

lstsq
0.0
0.0
0.0
0.0
0.0

pinv
0.0
0.0
0.0
0.0
0.0

pinv2
0.0
0.0
0.0
0.0
0.0

And some config:

In [3]: show_config()
blas_info:
    libraries = ['blas']
    library_dirs = ['/usr/lib64']
    language = f77
lapack_info:
    libraries = ['lapack']
    library_dirs = ['/usr/lib64']
    language = f77
atlas_threads_info:
  NOT AVAILABLE
blas_opt_info:
    libraries = ['blas']
    library_dirs = ['/usr/lib64']
    language = f77
    define_macros = [('NO_ATLAS_INFO', 1)]
atlas_blas_threads_info:
  NOT AVAILABLE
lapack_opt_info:
    libraries = ['lapack', 'blas']
    library_dirs = ['/usr/lib64']
    language = f77
    define_macros = [('NO_ATLAS_INFO', 1)]
atlas_info:
  NOT AVAILABLE
lapack_mkl_info:
  NOT AVAILABLE
blas_mkl_info:
  NOT AVAILABLE
atlas_blas_info:
  NOT AVAILABLE
mkl_info:
  NOT AVAILABLE

-- 
Un clavier azerty en vaut deux
----------------------------------------------------------
Éric Depagne                            eric at depagne.org

From bsouthey at gmail.com  Tue Dec 14 14:59:43 2010
From: bsouthey at gmail.com (Bruce Southey)
Date: Tue, 14 Dec 2010 13:59:43 -0600
Subject: [SciPy-User] understanding machine precision
In-Reply-To: <201012142029.18007.faltet@pytables.org>
References: <201012142029.18007.faltet@pytables.org>
Message-ID: <4D07CCAF.6090706@gmail.com>

On 12/14/2010 01:29 PM, Francesc Alted wrote:
> A Tuesday 14 December 2010 20:17:10 Nils Wagner escrigué:
>> Hi all,
>>
>> I am using ATLAS
>>
>> python -i try_deterministic.py
>>
>> lstsq
>> 0.0
>> 0.0
>> 0.0
>> 0.0
>> 0.0
> That's interesting.  Maybe Josef is using a threaded ATLAS?  I
> positively know that threading introduces variability in the order that
> the computations are done.  However, I'm not sure why ATLAS has
> decided to use several threads for so small matrices ((100, 10)) :-/
>
Does this 'issue' occur with numpy's lstsq?

This is most probably due to the OS (Windows) and compiler, as Skipper's
post indicates, and probably related to the CPU as well. It may be how
the binaries are created relative to the target system, especially if
different compilers are used.

Overclocking and heat can also create issues (just as those people
finding prime numbers :-) ).
Bruce From silva at lma.cnrs-mrs.fr Tue Dec 14 15:08:34 2010 From: silva at lma.cnrs-mrs.fr (Fabrice Silva) Date: Tue, 14 Dec 2010 17:08:34 -0300 Subject: [SciPy-User] understanding machine precision In-Reply-To: <201012142029.18007.faltet@pytables.org> References: <201012142029.18007.faltet@pytables.org> Message-ID: <1292357314.1866.13.camel@florian-desktop> What I get: $ python -i try_deterministic.py lstsq 0.0 0.0 0.0 0.0 0.0 pinv 0.0 0.0 0.0 0.0 0.0 pinv2 0.0 0.0 0.0 0.0 0.0 >>> np.show_config() blas_info: libraries = ['blas'] library_dirs = ['/usr/lib64'] language = f77 lapack_info: libraries = ['lapack'] library_dirs = ['/usr/lib64'] language = f77 atlas_threads_info: NOT AVAILABLE blas_opt_info: libraries = ['blas'] library_dirs = ['/usr/lib64'] language = f77 define_macros = [('NO_ATLAS_INFO', 1)] atlas_blas_threads_info: NOT AVAILABLE lapack_opt_info: libraries = ['lapack', 'blas'] library_dirs = ['/usr/lib64'] language = f77 define_macros = [('NO_ATLAS_INFO', 1)] atlas_info: NOT AVAILABLE lapack_mkl_info: NOT AVAILABLE blas_mkl_info: NOT AVAILABLE atlas_blas_info: NOT AVAILABLE mkl_info: NOT AVAILABLE >>> np.__version__ '1.3.0' on Ubuntu and x86_64 -- Fabrice Silva From josef.pktd at gmail.com Tue Dec 14 14:38:18 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Tue, 14 Dec 2010 14:38:18 -0500 Subject: [SciPy-User] understanding machine precision In-Reply-To: <201012142029.18007.faltet@pytables.org> References: <201012142029.18007.faltet@pytables.org> Message-ID: On Tue, Dec 14, 2010 at 2:29 PM, Francesc Alted wrote: > A Tuesday 14 December 2010 20:17:10 Nils Wagner escrigu?: >> Hi all, >> >> I am using ATLAS >> >> python -i try_deterministic.py >> >> lstsq >> 0.0 >> 0.0 >> 0.0 >> 0.0 >> 0.0 > > That's interesting. ?Maybe Josef is using a threaded ATLAS? ?I > positively know that threading introduces variability in the order that > the computations are done. ?However, I'm not sure on why ATLAS has > decided to use several threads for so small matrices ((100, 10)) :-/ No, I'm using an old plain ATLAS with a single CPU, but if Skipper is getting it also with 32bit mkl, then there might still be another Windows "feature" in play. Since my results are identical across python session, but not within, the results are still deterministic and cannot depend on how busy my computer is. Josef > > -- > Francesc Alted > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From kwgoodman at gmail.com Tue Dec 14 15:17:25 2010 From: kwgoodman at gmail.com (Keith Goodman) Date: Tue, 14 Dec 2010 12:17:25 -0800 Subject: [SciPy-User] understanding machine precision In-Reply-To: References: <201012142029.18007.faltet@pytables.org> Message-ID: On Tue, Dec 14, 2010 at 11:38 AM, wrote: > On Tue, Dec 14, 2010 at 2:29 PM, Francesc Alted wrote: >> A Tuesday 14 December 2010 20:17:10 Nils Wagner escrigu?: >>> Hi all, >>> >>> I am using ATLAS >>> >>> python -i try_deterministic.py >>> >>> lstsq >>> 0.0 >>> 0.0 >>> 0.0 >>> 0.0 >>> 0.0 >> >> That's interesting. ?Maybe Josef is using a threaded ATLAS? ?I >> positively know that threading introduces variability in the order that >> the computations are done. ?However, I'm not sure on why ATLAS has >> decided to use several threads for so small matrices ((100, 10)) :-/ > > No, I'm using an old plain ATLAS with a single CPU, but if Skipper is > getting it also with 32bit mkl, then there might still be another > Windows "feature" in play. 
> > Since my results are identical across python session, but not within,
> the results are still deterministic and cannot depend on how busy my
> computer is.

Come to think of it, one of my tests is to compare the output of each
new release of one of my packages with the previous release. My
colleague runs the test on a Windows machine. He gets a difference in
output when there should be none. Even after pulling the installer
apart and installing the non-ATLAS version he sees the difference on
Windows. It would be great to figure out what is going on.

$ python -i try_deterministic.py

lstsq
0.0
0.0
0.0
0.0
0.0

pinv
0.0
0.0
0.0
0.0
0.0

pinv2
0.0
0.0
0.0
0.0
0.0

$ uname -a
Linux kg 2.6.32-26-generic #48-Ubuntu SMP Wed Nov 24 10:14:11 UTC 2010
x86_64 GNU/Linux

>>> np.show_config()
atlas_threads_info:
  NOT AVAILABLE
blas_opt_info:
    libraries = ['f77blas', 'cblas', 'atlas']
    library_dirs = ['/usr/local/lib']
    define_macros = [('ATLAS_INFO', '"\\"3.8.3\\""')]
    language = c
atlas_blas_threads_info:
  NOT AVAILABLE
lapack_opt_info:
    libraries = ['lapack', 'f77blas', 'cblas', 'atlas']
    library_dirs = ['/usr/local/lib']
    define_macros = [('NO_ATLAS_INFO', 2)]
    language = f77
atlas_info:
    libraries = ['lapack', 'f77blas', 'cblas', 'atlas']
    library_dirs = ['/usr/local/lib']
    language = f77
lapack_mkl_info:
  NOT AVAILABLE
blas_mkl_info:
  NOT AVAILABLE
atlas_blas_info:
    libraries = ['f77blas', 'cblas', 'atlas']
    library_dirs = ['/usr/local/lib']
    language = c
mkl_info:
  NOT AVAILABLE

From vickram.l at gmail.com  Wed Dec 15 01:14:47 2010
From: vickram.l at gmail.com (Vick)
Date: Wed, 15 Dec 2010 01:14:47 -0500
Subject: [SciPy-User] Help With Scipy: Integration pack, returning an
	array for an array input
Message-ID: <4D085CD7.5080104@gmail.com>

I hope this is the right way to do this, and I apologize most sincerely
if it's not. Here is a step-by-step account of what I'm trying to do
and what I typed in; it's a simplified version that produces the same
error message as my real problem.

In [1]: import numpy as np

In [2]: import scipy.integrate as inte

In [3]: C = lambda u: inte.quad(lambda s: np.exp(s),0,u)

This yields:

In [4]: C(5)
Out[4]: (147.41315910257663, 1.6366148336841205e-12)

i.e. the integral of exp(s) from 0 to 5, and gives the error, which is
great. Now define:

In [5]: D = lambda t: C(t)[0]

In [6]: D(5)
Out[6]: 147.41315910257663

Just the first element, which is all I really care about; the error
isn't what I need right now.

Now try t = np.arange(0,5) to try and get it to work for a range of
values from 0 to 5.
In [7]: t = np.arange(0,5) In [8]: D(t) --------------------------------------------------------------------------- ValueError Traceback (most recent call last) /home/vick/ in () /home/vick/ in (t) /home/vick/ in (u) /usr/lib/python2.6/dist-packages/scipy/integrate/quadpack.pyc in quad(func, a, b, args, full_output,epsabs, epsrel, limit, points, weight, wvar, wopts, maxp1, limlst) 183 if type(args) != type(()): args = (args,) 184 if (weight is None): --> 185 retval = _quad(func,a,b,args,full_output,epsabs,epsrel,limit,points) 186 else: 187 retval =_quad_weight(func,a,b,args,full_output,epsabs,epsrel,limlst,limit,maxp1,weight,wvar,wopts) /usr/lib/python2.6/dist-packages/scipy/integrate/quadpack.pyc in _quad(func, a, b, args, full_output,epsabs, epsrel, limit, points) 231 def _quad(func,a,b,args,full_output,epsabs,epsrel,limit,points): 232 infbounds = 0 --> 233 if (b != Inf and a != -Inf): 234 pass # standard integration 235 elif (b == Inf and a != -Inf): ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all() I'd like to get the value of this integral over a range of t's. I don't really understand why it's doing this, and I'd like to avoid using a for loop. From jason-sage at creativetrax.com Wed Dec 15 09:58:35 2010 From: jason-sage at creativetrax.com (Jason Grout) Date: Wed, 15 Dec 2010 08:58:35 -0600 Subject: [SciPy-User] scipy.org is down Message-ID: <4D08D79B.1000804@creativetrax.com> FYI, it seems that scipy.org is down. Here is the error message from firefox: Server not found Firefox can't find the server at www.scipy.org. Thanks, Jason From gnurser at gmail.com Wed Dec 15 05:44:02 2010 From: gnurser at gmail.com (George Nurser) Date: Wed, 15 Dec 2010 10:44:02 +0000 Subject: [SciPy-User] probs making docs Message-ID: Hi, Just been making up numpy/ipython/scipy/matplotlib/pytables/netCDF4/cython etc on OS X 10.6 to go with python installed from the new 2.7.1. 10.6 x86_64/i386 installer. Most things seem to work very straightforwardly -- numpy and scipy trunk seem in very good shape, and the new python installer makes everything much easier -- and numpy and matplotlib docs made OK, but I'm having a problem making up the scipy documentation. In the scipy/doc directory I initially tried make html It failed, looking for numpydoc. I therefore installed numpydoc. The output of make html is now mkdir -p build touch build/generate-stamp mkdir -p build/html build/doctrees LANG=C sphinx-build -b html -d build/doctrees source build/html Running Sphinx v1.0.5 Scipy (VERSION 0.10.dev) (RELEASE 0.10.0.dev) Extension error: Could not import extension plot_directive (exception: No module named plot_directive) make: *** [html] Error 1 This seems strange, given that plot_directive.{py,pyc} are in the installed numpydoc directory /Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/numpydoc-0.4-py2.7.egg/numpydoc --George. From robert.kern at gmail.com Wed Dec 15 11:00:36 2010 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 15 Dec 2010 10:00:36 -0600 Subject: [SciPy-User] scipy.org is down In-Reply-To: <4D08D79B.1000804@creativetrax.com> References: <4D08D79B.1000804@creativetrax.com> Message-ID: On Wed, Dec 15, 2010 at 08:58, Jason Grout wrote: > FYI, it seems that scipy.org is down. It's back up. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." ? 
  -- Umberto Eco

From tmp50 at ukr.net  Wed Dec 15 10:24:20 2010
From: tmp50 at ukr.net (Dmitrey)
Date: Wed, 15 Dec 2010 17:24:20 +0200
Subject: [SciPy-User] new quarterly OpenOpt/FuncDesigner release 0.32
In-Reply-To: <4D085CD7.5080104@gmail.com>
References: <4D085CD7.5080104@gmail.com>
Message-ID: 

Hi all,

I'm glad to inform you about the new quarterly OpenOpt/FuncDesigner
release (0.32):

OpenOpt:
* New class: LCP (and related solver)
* New QP solver: qlcp
* New NLP solver: sqlcp
* New large-scale NSP (nonsmooth) solver gsubg. Currently it still
requires lots of improvements (especially for constraints - their
handling is still very premature and often fails), but since the solver
sometimes already works better than ipopt, algencan and the other
competitors it was tried against, I decided to include it in this
release.
* Now SOCP can handle Ax <= b constraints (and a bugfix for handling
lb <= x <= ub has been committed)
* Some other fixes and improvements

FuncDesigner:
* Add new function removeAttachedConstraints
* Add new oofuns min and max (their capabilities are quite restricted
yet)
* Systems of nonlinear equations: possibility to assign a personal
tolerance for an equation
* Some fixes and improvements

For more details see our forum entry
http://forum.openopt.org/viewtopic.php?id=325

Regards, D.

From jsseabold at gmail.com  Wed Dec 15 11:05:37 2010
From: jsseabold at gmail.com (Skipper Seabold)
Date: Wed, 15 Dec 2010 11:05:37 -0500
Subject: [SciPy-User] probs making docs
In-Reply-To: 
References: 
Message-ID: 

On Wed, Dec 15, 2010 at 5:44 AM, George Nurser wrote:
> Hi,
> Just been making up
> numpy/ipython/scipy/matplotlib/pytables/netCDF4/cython etc on OS X
> 10.6 to go with python installed from the new 2.7.1. 10.6 x86_64/i386
> installer.
>
> Most things seem to work very straightforwardly -- numpy and scipy
> trunk seem in very good shape, and the new python installer makes
> everything much easier -- and numpy and matplotlib docs made OK, but
> I'm having a problem making up the scipy documentation.
>
> In the scipy/doc directory I initially tried
> make html
> It failed, looking for numpydoc.
> I therefore installed numpydoc.
>
> The output of make html is now
>
> mkdir -p build
> touch build/generate-stamp
> mkdir -p build/html build/doctrees
> LANG=C sphinx-build -b html -d build/doctrees source build/html
> Running Sphinx v1.0.5
> Scipy (VERSION 0.10.dev) (RELEASE 0.10.0.dev)
>
> Extension error:
> Could not import extension plot_directive (exception: No module named
> plot_directive)
> make: *** [html] Error 1
>
> This seems strange, given that plot_directive.{py,pyc} are in the
> installed numpydoc directory
> /Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/numpydoc-0.4-py2.7.egg/numpydoc
>

What numpy version?  This is an incompatibility with new sphinx (post
1.0 betas) and old numpy sphinx extensions.  You can either downgrade
sphinx or upgrade numpy.

http://projects.scipy.org/numpy/ticket/1489

I think you could really just drop in the newer numpy/doc/sphinxext/*
and it should work, but maybe not...

https://github.com/numpy/numpy/tree/master/doc/sphinxext

Skipper
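As a concrete sketch of the "drop in the newer sphinxext" option
Skipper mentions: the checkout location and the target directory below
are assumptions pieced together from George's paths above, not commands
taken from this thread, so adjust them to your own layout:

    # grab the current numpy sphinx extensions (hypothetical checkout path)
    $ git clone git://github.com/numpy/numpy.git /tmp/numpy
    # overwrite the stale copies bundled with the installed numpydoc 0.4 egg
    $ cp /tmp/numpy/doc/sphinxext/*.py \
        /Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/numpydoc-0.4-py2.7.egg/numpydoc/
    # then rebuild the scipy docs
    $ cd scipy/doc && make html

Patching the installed egg in place like this is only a stopgap;
upgrading numpy (or downgrading sphinx to a pre-1.0 release) is the
cleaner fix.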
From Pierre.RAYBAUT at CEA.FR  Wed Dec 15 11:35:42 2010
From: Pierre.RAYBAUT at CEA.FR (Pierre.RAYBAUT at CEA.FR)
Date: Wed, 15 Dec 2010 17:35:42 +0100
Subject: [SciPy-User] [ANN] guidata v1.2.5
Message-ID: 

Hi all,

I am pleased to announce that `guidata` v1.2.5 has been released.
Note that the project has recently been moved to GoogleCode:
http://guidata.googlecode.com

This version of `guidata` includes brand new documentation with
examples, API reference, etc.: http://packages.python.org/guidata/

Based on the Qt Python binding module PyQt4, guidata is a Python
library generating graphical user interfaces for easy dataset editing
and display. It also provides helpers and application development tools
for PyQt4.

guidata also provides the following features:
* guidata.qthelpers: PyQt4 helpers
* guidata.disthelpers: py2exe helpers
* guidata.userconfig: .ini configuration management helpers (based on
Python standard module ConfigParser)
* guidata.configtools: library/application data management
* guidata.gettext_helpers: translation helpers (based on the GNU tool
gettext)
* guidata.guitest: automatic GUI-based test launcher
* guidata.utils: miscellaneous utilities

guidata has been successfully tested on GNU/Linux and Windows
platforms.

Python package index page:
http://pypi.python.org/pypi/guidata/
Documentation, screenshots:
http://packages.python.org/guidata/
Downloads (source + Python(x,y) plugin):
http://guidata.googlecode.com

Cheers,
Pierre

---
Dr. Pierre Raybaut
CEA - Commissariat à l'Energie Atomique et aux Energies Alternatives

From Pierre.RAYBAUT at CEA.FR  Wed Dec 15 11:35:51 2010
From: Pierre.RAYBAUT at CEA.FR (Pierre.RAYBAUT at CEA.FR)
Date: Wed, 15 Dec 2010 17:35:51 +0100
Subject: [SciPy-User] [ANN] guiqwt v2.0.8
Message-ID: 

Hi all,

I am pleased to announce that `guiqwt` v2.0.8 has been released.
Note that the project has recently been moved to GoogleCode:
http://guiqwt.googlecode.com

This version of `guiqwt` includes brand new documentation with
examples, API reference, etc.: http://packages.python.org/guiqwt/

Based on PyQwt (plotting widgets for PyQt4 graphical user interfaces)
and on the scientific modules NumPy and SciPy, guiqwt is a Python
library providing efficient 2D data-plotting features (curve/image
visualization and related tools) for interactive computing and
signal/image processing application development.

When compared to the excellent module `matplotlib`, the main advantage
of `guiqwt` is performance: see
http://packages.python.org/guiqwt/overview.html#performances.

But `guiqwt` is more than a plotting library; it also provides:
* Helper functions for data processing: see the example
http://packages.python.org/guiqwt/examples.html#curve-fitting
* Framework for signal/image processing application development: see
http://packages.python.org/guiqwt/examples.html
* And many other features like making executable Windows programs
easily (py2exe helpers): see
http://packages.python.org/guiqwt/disthelpers.html

guiqwt plotting features are the following:

guiqwt.pyplot: equivalent to matplotlib's pyplot module (pylab)

supported plot items:
* curves, error bar curves and 1-D histograms
* images (RGB images are not supported), images with non-linear x/y
scales, images with specified pixel size (e.g.
loaded from DICOM files), 2-D histograms, pseudo-color images (pcolor) * labels, curve plot legends * shapes: polygon, polylines, rectangle, circle, ellipse and segment * annotated shapes (shapes with labels showing position and dimensions): rectangle with center position and size, circle with center position and diameter, ellipse with center position and diameters (these items are very useful to measure things directly on displayed images) curves, images and shapes: * multiple object selection for moving objects or editing their properties through automatically generated dialog boxes (guidata) * item list panel: move objects from foreground to background, show/hide objects, remove objects, ... * customizable aspect ratio * a lot of ready-to-use tools: plot canvas export to image file, image snapshot, image rectangular filter, etc. curves: * interval selection tools with labels showing results of computing on selected area * curve fitting tool with automatic fit, manual fit with sliders, ... images: * contrast adjustment panel: select the LUT by moving a range selection object on the image levels histogram, eliminate outliers, ... * X-axis and Y-axis cross-sections: support for multiple images, average cross-section tool on a rectangular area, ... * apply any affine transform to displayed images in real-time (rotation, magnification, translation, horizontal/vertical flip, ...) application development helpers: * ready-to-use curve and image plot widgets and dialog boxes * load/save graphical objects (curves, images, shapes) * a lot of test scripts which demonstrate guiqwt features guiqwt has been successfully tested on GNU/Linux and Windows platforms. Python package index page: http://pypi.python.org/pypi/guiqwt/ Documentation, screenshots: http://packages.python.org/guiqwt/ Downloads (source + Python(x,y) plugin): http://guiqwt.googlecode.com Cheers, Pierre --- Dr. Pierre Raybaut CEA - Commissariat ? l'Energie Atomique et aux Energies Alternatives From josef.pktd at gmail.com Wed Dec 15 07:43:15 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Wed, 15 Dec 2010 07:43:15 -0500 Subject: [SciPy-User] understanding machine precision In-Reply-To: References: <201012142029.18007.faltet@pytables.org> Message-ID: On Tue, Dec 14, 2010 at 3:17 PM, Keith Goodman wrote: > On Tue, Dec 14, 2010 at 11:38 AM, ? wrote: >> On Tue, Dec 14, 2010 at 2:29 PM, Francesc Alted wrote: >>> A Tuesday 14 December 2010 20:17:10 Nils Wagner escrigu?: >>>> Hi all, >>>> >>>> I am using ATLAS >>>> >>>> python -i try_deterministic.py >>>> >>>> lstsq >>>> 0.0 >>>> 0.0 >>>> 0.0 >>>> 0.0 >>>> 0.0 >>> >>> That's interesting. ?Maybe Josef is using a threaded ATLAS? ?I >>> positively know that threading introduces variability in the order that >>> the computations are done. ?However, I'm not sure on why ATLAS has >>> decided to use several threads for so small matrices ((100, 10)) :-/ >> >> No, I'm using an old plain ATLAS with a single CPU, but if Skipper is >> getting it also with 32bit mkl, then there might still be another >> Windows "feature" in play. >> >> Since my results are identical across python session, but not within, >> the results are still deterministic and cannot depend on how busy my >> computer is. > > Come to think if it, one of my tests is to compare the output of each > new release of one of my packages with the previous release. My > colleague runs the test on a windows machine. He gets a difference in > output when there should be none. 
Even after pulling the installer
>> apart and installing the non-ATLAS version he sees the difference on
>> Windows. It would be great to figure out what is going on.

It looks like it's 32-bit Windows specific, but not specific to an
ATLAS or LAPACK version.

I still don't have a clue why, but it looks like tests that use linalg
have to be weakened to decimal=14 or decimal=15, larger than I thought
previously.

Thanks,

Josef
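To make "weakened to decimal=14" concrete, this is the kind of check
that is affected -- x and x2 being the two lstsq results from the
script attached earlier in this thread (a sketch of the tolerance
involved, not a test from any particular package):

    from numpy.testing import assert_array_almost_equal

    # roughly: passes when abs(x - x2) < 0.5 * 10**(-decimal) elementwise,
    # so decimal=14 tolerates differences up to ~5e-15, enough to cover
    # the ~1.6e-15 lstsq discrepancies reported on 32-bit Windows above
    assert_array_almost_equal(x, x2, decimal=14)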
From josef.pktd at gmail.com  Wed Dec 15 07:36:44 2010
From: josef.pktd at gmail.com (josef.pktd at gmail.com)
Date: Wed, 15 Dec 2010 07:36:44 -0500
Subject: [SciPy-User] understanding machine precision
In-Reply-To: <4D07CCAF.6090706@gmail.com>
References: <201012142029.18007.faltet@pytables.org> <4D07CCAF.6090706@gmail.com>
Message-ID: 

On Tue, Dec 14, 2010 at 2:59 PM, Bruce Southey wrote:
> On 12/14/2010 01:29 PM, Francesc Alted wrote:
>> A Tuesday 14 December 2010 20:17:10 Nils Wagner escrigué:
>>> Hi all,
>>>
>>> I am using ATLAS
>>>
>>> python -i try_deterministic.py
>>>
>>> lstsq
>>> 0.0
>>> 0.0
>>> 0.0
>>> 0.0
>>> 0.0
>> That's interesting. Maybe Josef is using a threaded ATLAS? I positively know that threading introduces variability in the order that the computations are done. However, I'm not sure on why ATLAS has decided to use several threads for so small matrices ((100, 10)) :-/
>>
> Does this 'issue' occur with numpy's lstsq?

good question, I didn't think about it.

same problem and my numpy is from an official installer (1.4.0) and not compiled against the old ATLAS, that I'm still using.

>
> This is most probably due to the OS (Windows) and compiler as Skipper's post indicates and probably related to the cpu as well. It may be how the binaries are created relative to the target system especially if different compilers are used.

Skipper mentioned mkl, so I guess his scipy is compiled with Microsoft Visual, while mine is compiled with MinGW.

>
> Overclocking and heat can also create issues (just as those people finding prime numbers :-) ).

I don't think my Dell factory notebook is overclocked.

Josef

>
> Bruce
>
> _______________________________________________
> SciPy-User mailing list
> SciPy-User at scipy.org
> http://mail.scipy.org/mailman/listinfo/scipy-user
>

From bsouthey at gmail.com  Wed Dec 15 12:31:30 2010
From: bsouthey at gmail.com (Bruce Southey)
Date: Wed, 15 Dec 2010 11:31:30 -0600
Subject: [SciPy-User] understanding machine precision
In-Reply-To: 
References: <201012142029.18007.faltet@pytables.org>
Message-ID: <4D08FB72.9020600@gmail.com>

On 12/15/2010 06:43 AM, josef.pktd at gmail.com wrote:
[clip]
> It looks like it's Windows32 specific, but not specific to an ATLAS or LAPACK version.
>
> I still don't have a clue why, but it looks like tests that use linalg have to be weakened to decimal=14 or decimal=15, larger than I thought previously.

Can you please detail the versions of Python and numpy as well as how each was built? Does it use sse and does it use the gcc option '-fexcess-precision'?

Given Skipper's response, it may be due to cross-compiling (including compiler and OS differences) as well as being at the limits of numerical accuracy for that precision (loss of significance or precision). (Reading gcc bug 323 (http://gcc.gnu.org/bugzilla/show_bug.cgi?id=323) is interesting or not.)

Bruce
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From pav at iki.fi  Wed Dec 15 05:47:03 2010
From: pav at iki.fi (Pauli Virtanen)
Date: Wed, 15 Dec 2010 10:47:03 +0000 (UTC)
Subject: [SciPy-User] Help With Scipy: Integration pack, returning an array for an array input
References: <4D085CD7.5080104@gmail.com>
Message-ID: 

Wed, 15 Dec 2010 01:14:47 -0500, Vick wrote:
[clip]
> I'd like to get the value of this integral over a range of t's. I don't really understand why it's doing this, and I'd like to avoid using a for loop.

There's no way to avoid a loop here. The integration algorithm is adaptive and cannot cope with vector inputs. You can do F = numpy.vectorize(D) but speedwise this will be more or less the same as a for loop.

-- 
Pauli Virtanen
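For illustration, a minimal sketch of the vectorize approach Pauli describes; the integrand here is a stand-in, not the original poster's function:

    import numpy as np
    from scipy.integrate import quad

    def D(t):
        # quad is adaptive and returns a scalar result, so D only accepts scalar t
        return quad(lambda x: np.exp(-x * x), 0.0, t)[0]

    F = np.vectorize(D)   # elementwise wrapper; internally still a Python loop
    print F(np.linspace(0.0, 2.0, 5))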
From faltet at pytables.org  Wed Dec 15 13:22:04 2010
From: faltet at pytables.org (Francesc Alted)
Date: Wed, 15 Dec 2010 19:22:04 +0100
Subject: [SciPy-User] understanding machine precision
In-Reply-To: 
References: <4D07CCAF.6090706@gmail.com>
Message-ID: <201012151922.04942.faltet@pytables.org>

A Wednesday 15 December 2010 13:36:44 josef.pktd at gmail.com escrigué:
[clip]
>> Does this 'issue' occur with numpy's lstsq?
>
> good question, I didn't think about it.
>
> same problem and my numpy is from an official installer (1.4.0) and not compiled against the old ATLAS, that I'm still using.

With the slightly modified script that depends only on numpy, I'm having problems of reproducibility only with pinv:

C:\Users\francesc>python x:\try_deterministic_numpy.py
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Python version: 2.6.6 (r266:84297, Aug 24 2010, 18:46:32) [MSC v.1500 32 bit (Intel)]
NumPy version: 1.5.0
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=

lstsq
0.0
0.0
0.0
0.0
0.0

pinv
0.0
5.55111512313e-17
0.0
0.0
0.0

And, curiously enough, that pattern is always the same for every run. So, at least, the issue is more deterministic than we initially thought.

But, in the same machine, using Python 2.7, problem disappears completely:

C:\Users\francesc>python x:\try_deterministic_numpy.py
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Python version: 2.7 (r27:82525, Jul 4 2010, 09:01:59) [MSC v.1500 32 bit (Intel)]
NumPy version: 1.5.0
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=

lstsq
0.0
0.0
0.0
0.0
0.0

pinv
0.0
0.0
0.0
0.0
0.0

Well, at least we have narrowed the possibilities significantly: this seems to happen only (up to now) with Win32, Python 2.6 and NumPy (without ATLAS in the loop). It is worth noting that the compilers used for Python 2.6 and 2.7 should be the same (MSVC 1500). I think I installed NumPy from sourceforge repository, so the compiler used here should also be the same.

And also interesting is the fact that the 'lstsq' method is always reproducible on my machine. However, Josef also reported non-deterministic behaviour with 'lstsq' :-/

-- 
Francesc Alted
-------------- next part --------------
A non-text attachment was scrubbed...
Name: try_deterministic_numpy.py
Type: text/x-python
Size: 725 bytes
Desc: not available
URL: 

From laytonjb at att.net  Wed Dec 15 13:53:19 2010
From: laytonjb at att.net (Jeff Layton)
Date: Wed, 15 Dec 2010 13:53:19 -0500
Subject: [SciPy-User] scipy down again?
Message-ID: <4D090E9F.2000905@att.net>

I can't seem to access the site. Is it down again?

TIA,

Jeff

From robert.kern at gmail.com  Wed Dec 15 14:23:46 2010
From: robert.kern at gmail.com (Robert Kern)
Date: Wed, 15 Dec 2010 13:23:46 -0600
Subject: [SciPy-User] scipy down again?
In-Reply-To: <4D090E9F.2000905@att.net>
References: <4D090E9F.2000905@att.net>
Message-ID: 

On Wed, Dec 15, 2010 at 12:53, Jeff Layton wrote:
> I can't seem to access the site. Is it down again?

Yes. We're working on it.

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth."
  -- Umberto Eco

From dagss at student.matnat.uio.no  Sat Dec 18 04:10:05 2010
From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn)
Date: Sat, 18 Dec 2010 10:10:05 +0100
Subject: [SciPy-User] understanding machine precision
In-Reply-To: <201012151922.04942.faltet@pytables.org>
References: <4D07CCAF.6090706@gmail.com> <201012151922.04942.faltet@pytables.org>
Message-ID: <4D0C7A6D.9030001@student.matnat.uio.no>

Francesc Alted wrote:
[clip]
> And also interesting is the fact that the 'lstsq' method is always reproducible on my machine. However, Josef also reported non-deterministic behaviour with 'lstsq' :-/

How about memory alignment? It may be that

a) the matrix has to be copied to new memory before passing it to LAPACK,
b) you get different results depending on whether the array is allocated on a 128-bit boundary or not.

One way to test is to stick "aligned16" (or was that "align16"?) into generic_flapack.pyf and rebuild SciPy (there's a couple of aligned8 in there you can look at). Or, see if it matters if you make your array Fortran contiguous.

Dag Sverre

From david at silveregg.co.jp  Wed Dec 15 22:25:43 2010
From: david at silveregg.co.jp (David)
Date: Thu, 16 Dec 2010 12:25:43 +0900
Subject: [SciPy-User] understanding machine precision
In-Reply-To: <201012151922.04942.faltet@pytables.org>
References: <4D07CCAF.6090706@gmail.com> <201012151922.04942.faltet@pytables.org>
Message-ID: <4D0986B7.2040601@silveregg.co.jp>

On 12/16/2010 03:22 AM, Francesc Alted wrote:
>
> And, curiously enough, that pattern is always the same for every run.
> So, at least, the issue is more deterministic than we initially thought.

The first thing that would come to my mind is FPU state changes. I don't have time to investigate the issue ATM, but that should be relatively straightforward to check (dumping the FPU state at each run, through ctypes if necessary),

cheers,

David
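A quick way to do the check David suggests on Windows, reading the x87 control word through the Microsoft C runtime (this sketch assumes msvcrt is available; calling _control87 with mask 0 only queries, it does not modify the state):

    import ctypes

    msvcrt = ctypes.cdll.msvcrt
    cw = msvcrt._control87(0, 0)   # new value 0, mask 0: read-only query
    print hex(cw)                  # compare this value between runs and sessions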
From pramo4d at gmail.com  Sun Dec 19 12:28:43 2010
From: pramo4d at gmail.com (Pramod)
Date: Sun, 19 Dec 2010 09:28:43 -0800 (PST)
Subject: [SciPy-User] I want to convert following matlab program into the python programme
Message-ID: 

Hi friends, I want to convert the following Matlab code into Python code using SciPy. Highlighted part: I am not getting how to handle the Chebyshev polynomial in SciPy. Can anyone suggest how to write this in Python?

Nmax = 50; E = zeros(3,Nmax);
for N = 1:Nmax;
  [D,x] = cheb(N);    what is the syntax in scipy for this
  v = abs(x).^3; vprime = 3*x.*abs(x);       % 3rd deriv in BV
  E(1,N) = norm(D*v-vprime,inf);
  v = exp(-x.^(-2)); vprime = 2.*v./x.^3;    % C-infinity
  E(2,N) = norm(D*v-vprime,inf);
  v = 1./(1+x.^2); vprime = -2*x.*v.^2;      % analytic in [-1,1]
  E(3,N) = norm(D*v-vprime,inf);
  v = x.^10; vprime = 10*x.^9;               % polynomial
  E(4,N) = norm(D*v-vprime,inf);
end

I tried the following in IPython and this is the error I am getting:

for i in range(1,20):
   ....:     [D,i]=scipy.special.chebyu(3)(0.2)

ERROR IS: 'numpy.float64' object is not iterable

Thanks in advance

From pramo4d at gmail.com  Sat Dec 18 02:40:09 2010
From: pramo4d at gmail.com (Pramod)
Date: Fri, 17 Dec 2010 23:40:09 -0800 (PST)
Subject: [SciPy-User] Chebshev polynomial implimetation scipy
Message-ID: <9d078896-48fe-4a98-9cfc-8280eb55d22a@n32g2000pre.googlegroups.com>

Dear friends,

How do I implement Chebyshev polynomials in SciPy? Can you please give me an example?

Thanks in advance

From charlesr.harris at gmail.com  Mon Dec 20 12:24:52 2010
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Mon, 20 Dec 2010 10:24:52 -0700
Subject: [SciPy-User] Chebshev polynomial implimetation scipy
In-Reply-To: <9d078896-48fe-4a98-9cfc-8280eb55d22a@n32g2000pre.googlegroups.com>
References: <9d078896-48fe-4a98-9cfc-8280eb55d22a@n32g2000pre.googlegroups.com>
Message-ID: 

On Sat, Dec 18, 2010 at 12:40 AM, Pramod wrote:
> Dear friends,
>
> How do I implement Chebyshev polynomials in SciPy? Can you please give me an example?
>
> Thanks in advance

There are Chebyshev polynomials in numpy since version 1.4. Try from numpy.polynomial import Chebyshev.

Chuck
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 
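A short sketch of the class Chuck points to; the coefficient list here is just an example, picking out the third Chebyshev polynomial:

    from numpy.polynomial import Chebyshev

    p = Chebyshev([0, 0, 0, 1])   # coefficients of T_0..T_3, i.e. p(x) = T_3(x)
    print p(0.2)                  # 4*0.2**3 - 3*0.2 = -0.568
    print p.deriv()(0.2)          # T_3'(x) = 12*x**2 - 3, so -2.52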
From charlesr.harris at gmail.com  Mon Dec 20 12:36:56 2010
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Mon, 20 Dec 2010 10:36:56 -0700
Subject: [SciPy-User] I want to convert following matlab program into the python programme
In-Reply-To: 
References: 
Message-ID: 

On Sun, Dec 19, 2010 at 10:28 AM, Pramod wrote:
[clip]

That code is a complete mess. What is it trying to accomplish?

Chuck
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From nwagner at iam.uni-stuttgart.de  Mon Dec 20 12:41:18 2010
From: nwagner at iam.uni-stuttgart.de (Nils Wagner)
Date: Mon, 20 Dec 2010 18:41:18 +0100
Subject: [SciPy-User] EXTENDED MATRIX-MARKET FILE FORMAT
Message-ID: 

Hi all,

FWIW, there is an extended matrix-market file format.
http://www.staff.science.uu.nl/~bisse101/Mondriaan/Docs/extendedMM.pdf

Cheers,

Nils

From pramo4d at gmail.com  Sat Dec 18 13:35:37 2010
From: pramo4d at gmail.com (Pramod)
Date: Sat, 18 Dec 2010 10:35:37 -0800 (PST)
Subject: [SciPy-User] Chebeshev Polynomial implimentation python
Message-ID: <37e1b333-6fee-4f42-9a4c-e63fdeb329c9@i25g2000prd.googlegroups.com>

Matlab implementation:

for N = 1:Nmax;
  [D,x] = cheb(N);

How do I implement the above (written in Matlab) Chebyshev polynomial in Python?

From pav at iki.fi  Mon Dec 20 15:18:16 2010
From: pav at iki.fi (Pauli Virtanen)
Date: Mon, 20 Dec 2010 20:18:16 +0000 (UTC)
Subject: [SciPy-User] I want to convert following matlab program into the python programme
References: 
Message-ID: 

On Sun, Dec 19, 2010 at 10:28 AM, Pramod wrote:
[clip]
> [D,x] = cheb(N);    what is the syntax in scipy for this

cheb(N) is not a standard Matlab function. Before someone can help you out here, you need to tell what this function does.

-- 
Pauli Virtanen

From seb.haase at gmail.com  Mon Dec 20 15:23:17 2010
From: seb.haase at gmail.com (Sebastian Haase)
Date: Mon, 20 Dec 2010 21:23:17 +0100
Subject: [SciPy-User] Reading TDM/TDMS Files with scipy
In-Reply-To: 
References: 
Message-ID: 

Hi Nils,

did you get anywhere with this? It sounds like a ctypes / numpy thing ... I'm also considering reading LabVIEW data in binary format. And it seems that while the TDMS structure is documented
http://zone.ni.com/devzone/cda/tut/p/id/5696
there is a paragraph saying
"""
This article does not describe how to decode DAQmx data. If you need to read a TDMS file with software that implements native support for TDMS (without using any components provided by National Instruments), you will __not__ be able to interpret this data.
"""
So I guess, the link you gave is really the only way to go ...

Cheers,
Sebastian Haase

On Sun, Nov 28, 2010 at 8:12 PM, Nils Wagner wrote:
> Hi all,
>
> Is it possible to read TDM/TDMS files with scipy ?
>
> I found a tool for Matlab
> http://zone.ni.com/devzone/cda/epd/p/id/5957
>
> Nils
>
> _______________________________________________
> SciPy-User mailing list
> SciPy-User at scipy.org
> http://mail.scipy.org/mailman/listinfo/scipy-user
>

From almar.klein at gmail.com  Mon Dec 20 16:29:02 2010
From: almar.klein at gmail.com (Almar Klein)
Date: Mon, 20 Dec 2010 22:29:02 +0100
Subject: [SciPy-User] ANN: Visvis v1.4 - The object oriented approach to visualization
In-Reply-To: 
References: 
Message-ID: 

Hi all,

I am pleased to announce release 1.4 of Visvis - The object oriented approach to visualization.

Visvis is a pure Python library for visualization of 1D to 4D data in an object oriented way. Essentially, visvis is an object oriented layer of Python on top of OpenGL, thereby combining the power of OpenGL with the usability of Python. A Matlab-like interface in the form of a set of functions allows easy creation of objects (e.g. plot(), imshow(), volshow(), surf()).

website: http://code.google.com/p/visvis/
Discussion group: http://groups.google.com/group/visvis/
Documentation: http://code.google.com/p/visvis/wiki/Visvis_basics

Cheers,
Almar

=== Release notes ===

Much work has been done since the previous release. Visvis has made a few important steps towards maturity. Also, Visvis is now BSD-licensed.

The scenes are now automatically redrawn when an object's property is changed, and the Axes objects buffer their contents so they can redraw much faster when nothing has changed (for example when using multiple subplots). Using new functions movieRead and movieWrite, movies can be imported and exported to/from animated GIF, SWF (flash), AVI (needs ffmpeg), and a series of images. Visvis now uses guisupport.py, enabling seamless integration in the IEP and IPython interactive event loops. Furthermore, many docstrings have been improved and so has the script that creates the online API documentation from them. Several examples have been added and all examples are now available via the website (including images and animations). See here for more.

PS: what happened to the mailing list? Was it down for a few days?
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 
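Going by the functions named in the announcement, usage looks roughly like this; a sketch only, the exact call signatures are assumptions and the documentation linked above is authoritative:

    import numpy as np
    import visvis as vv

    vv.plot(np.sin(np.linspace(0, 6.3, 100)))   # curve plot
    vv.imshow(np.random.rand(64, 64))           # 2-D image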
From willardmaier at gmail.com  Mon Dec 20 19:55:42 2010
From: willardmaier at gmail.com (Willard Maier)
Date: Mon, 20 Dec 2010 18:55:42 -0600
Subject: [SciPy-User] Building scipy on Linux
Message-ID: 

I'm trying to build scipy and numpy on Linux (Mint, a derivative of Ubuntu), and I'm following the directions on http://www.scipy.org/Installing_SciPy/Linux for "Building everything from source using gfortran on Ubuntu, Nov 2010". After several hours of work I finally got to the step for building scipy, "python setup.py build", but here I get the following error:

Traceback (most recent call last):
  File "setup.py", line 85, in
    FULLVERSION += svn_version()
  File "setup.py", line 58, in svn_version
    from numpy.compat import asstr
ImportError: No module named compat

Not sure what I did wrong. I downloaded the latest numpy from github and the latest scipy with Subversion from http://svn.scipy.org/svn/scipy/trunk. I also got lapack, atlas, UMFPACK, AMD, and UFconfig, and built them from source. All previous steps succeeded after a bit of coaxing. Does anyone have any suggestions?

Bill Maier
http://bmaier.com
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From ecarlson at eng.ua.edu  Mon Dec 20 22:55:09 2010
From: ecarlson at eng.ua.edu (Eric Carlson)
Date: Mon, 20 Dec 2010 21:55:09 -0600
Subject: [SciPy-User] Chebeshev Polynomial implimentation python
In-Reply-To: <37e1b333-6fee-4f42-9a4c-e63fdeb329c9@i25g2000prd.googlegroups.com>
References: <37e1b333-6fee-4f42-9a4c-e63fdeb329c9@i25g2000prd.googlegroups.com>
Message-ID: 

On 12/18/2010 12:35 PM, Pramod wrote:
> Matlab implementation:
> for N = 1:Nmax;
>   [D,x] = cheb(N);
>
> How do I implement the above (written in Matlab) Chebyshev polynomial in Python?

cheb is not a standard matlab function, but if this is it:

function [D,x]=cheb(N)

if N==0, D=0; x=1; return; end
x=cos(pi*(0:N)/N)';
c=[2; ones(N-1,1); 2].*(-1).^(0:N)';
X=repmat(x,1,N+1);
dX=X-X';
D=(c*(1./c)')./(dX+eye(N+1));   % off diagonals
D=D-diag(sum(D'));              % diagonals

Then a python version could be given by:

from numpy import cos,pi,linspace,array,matrix,ones,hstack,eye,diag,tile

def cheb(N):
    if N==0:
        D=0.0
        x=1.0
    else:
        x=cos(pi*linspace(0,N,N+1)/N)
        c=matrix(hstack(([2.],ones(N-1),[2.]))*(-1)**linspace(0,N,N+1)).T
        X=tile(x,[N+1,1])
        dX=(X-X.T).T
        D=array(c*(1/c).T)/(dX+eye(N+1))   # off diagonals
        D=D-diag(D.sum(axis=1))            # diagonals (numpy's sum, not the builtin)
    return D,x

(D,x)=cheb(3)
print D
(D,x)=cheb(4)
print D

I almost literally translated the matlab code, so I do not know how efficient it all is, and have given zero thought to better ways to do it. You need to be very careful with matrix and array types. Unless you KNOW you want a matrix, you probably want an array. One of the biggest transitions from matlab to python is learning to not worry about whether something is a column or row array, except when you are doing matrix multiplication.

HTH
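A quick sanity check of the translation, using the cheb above; the differentiation matrix should be exact for sampled polynomials of degree up to N:

    import numpy as np

    D, x = cheb(5)
    print np.allclose(np.dot(D, x**3), 3 * x**2)   # True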
From charlesr.harris at gmail.com  Mon Dec 20 23:17:00 2010
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Mon, 20 Dec 2010 21:17:00 -0700
Subject: [SciPy-User] Chebeshev Polynomial implimentation python
In-Reply-To: 
References: <37e1b333-6fee-4f42-9a4c-e63fdeb329c9@i25g2000prd.googlegroups.com>
Message-ID: 

On Mon, Dec 20, 2010 at 8:55 PM, Eric Carlson wrote:
[clip]

As a guess, D is the differentiation operator for a function sampled at the Chebyshev points of the second kind. So perhaps this is intended as a method of solution for a boundary value problem.

Chuck
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From charlesr.harris at gmail.com  Mon Dec 20 23:46:58 2010
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Mon, 20 Dec 2010 21:46:58 -0700
Subject: [SciPy-User] Chebeshev Polynomial implimentation python
In-Reply-To: 
References: <37e1b333-6fee-4f42-9a4c-e63fdeb329c9@i25g2000prd.googlegroups.com>
Message-ID: 

On Mon, Dec 20, 2010 at 9:17 PM, Charles R Harris wrote:
[clip]
> As a guess, D is the differentiation operator for a function sampled at the Chebyshev points of the second kind. So perhaps this is intended as a method of solution for a boundary value problem.

If so, the following brute force method will work.

In [1]: import numpy.polynomial as poly

In [2]: def Cheb(N):
   ...:     x = poly.chebpts2(N+1)
   ...:     D = array([poly.Chebyshev.fit(x,c,N).deriv()(x) for c in eye(N+1)])
   ...:     return D, x

However, I have a module for this type of use case which I've attached. I'm not sure what shape it's in, it's old. The relevant function would be modified_derivative

Chuck
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 
-------------- next part --------------
A non-text attachment was scrubbed...
Name: chebyshev.py
Type: text/x-python
Size: 18606 bytes
Desc: not available
URL: 

From charlesr.harris at gmail.com  Mon Dec 20 23:52:15 2010
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Mon, 20 Dec 2010 21:52:15 -0700
Subject: [SciPy-User] Chebeshev Polynomial implimentation python
In-Reply-To: 
References: <37e1b333-6fee-4f42-9a4c-e63fdeb329c9@i25g2000prd.googlegroups.com>
Message-ID: 

On Mon, Dec 20, 2010 at 9:46 PM, Charles R Harris wrote:
[clip]
> In [2]: def Cheb(N):
>    ...:     x = poly.chebpts2(N+1)
>    ...:     D = array([poly.Chebyshev.fit(x,c,N).deriv()(x) for c in eye(N+1)])
>    ...:     return D, x

Oops, D needs a transpose.

Chuck
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From linuxerwang at gmail.com  Tue Dec 21 03:22:05 2010
From: linuxerwang at gmail.com (Linuxer Wang)
Date: Tue, 21 Dec 2010 00:22:05 -0800
Subject: [SciPy-User] How to get an offline scipy cookbook?
Message-ID: <4D1063AD.50200@gmail.com>

Hi, all

I am a new user of scipy. I found that the scipy cookbook is a lot more helpful for me than the official docs. But I don't know where I can download an offline version of the cookbook. Is it ok for me to mirror the whole cookbook with wget? (I don't want to overload the scipy website by mirroring it.)

Thanks.

- linuxerwang

From pav at iki.fi  Tue Dec 21 04:37:22 2010
From: pav at iki.fi (Pauli Virtanen)
Date: Tue, 21 Dec 2010 09:37:22 +0000 (UTC)
Subject: [SciPy-User] How to get an offline scipy cookbook?
References: <4D1063AD.50200@gmail.com>
Message-ID: 

Tue, 21 Dec 2010 00:22:05 -0800, Linuxer Wang wrote:
[clip]
> Is it ok for me to mirror the whole cookbook with wget? (I don't want to overload the scipy website by mirroring it.)

There's no offline version.

The cookbook is not huge, so sensible wget-ing is probably OK. I'd maybe use --wait=1 or something to not bomb the server too much, though.

Btw, what aspects of the main docs did you find not useful? Did you check the tutorials? Are they too long? Too unfocused? Don't cover enough features?

-- 
Pauli Virtanen

From JRadinger at gmx.at  Tue Dec 21 06:06:50 2010
From: JRadinger at gmx.at (Johannes Radinger)
Date: Tue, 21 Dec 2010 12:06:50 +0100
Subject: [SciPy-User] solving integration, density function
Message-ID: <20101221110650.32180@gmx.net>

Hello,

I am really new to Python and SciPy. I want to solve an integrated function with a Python script and I think SciPy should do that :)

My task:

I do have some variables (s, m, K) which are now absolutely set, but in future I'll get the values via another process of Python.

s = 400
m = 0
K = 1

And I have the following function:

(1/((s*K)*sqrt(2*pi)))*exp(-1/2*(((x-m)/(s*K))^2))

which is the density function of the normal distribution, a symmetrical curve with the mean (m) of 0.

The total area under the curve is 1 (100%), which is for an integration from -inf to +inf. I want to know x in the case of 99%: meaning that the integral (-x to +x) of the function is 0.99. Due to the symmetry of the curve you can also set the integral from 0 to +x equal to (0.99/2):

0.99 = integral((1/((s*K)*sqrt(2*pi)))*exp(-1/2*(((x-m)/(s*K))^2)), -x, x)

resp.

(0.99/2) = integral((1/((s*K)*sqrt(2*pi)))*exp(-1/2*(((x-m)/(s*K))^2)), 0, x)

How can I solve that question in SciPy/Python so that I get x in the end? I don't know how to write the code...

thank you very much

johannes
-- 
Sicherer, schneller und einfacher. Die aktuellen Internet-Browser -
jetzt kostenlos herunterladen! http://portal.gmx.net/de/go/atbrowser

From Gregor.Thalhammer at gmail.com  Tue Dec 21 07:20:47 2010
From: Gregor.Thalhammer at gmail.com (Gregor Thalhammer)
Date: Tue, 21 Dec 2010 13:20:47 +0100
Subject: [SciPy-User] solving integration, density function
In-Reply-To: <20101221110650.32180@gmx.net>
References: <20101221110650.32180@gmx.net>
Message-ID: <07515CE8-8F03-40C6-9A29-FA1AE7AE8AF1@gmail.com>

Am 21.12.2010 um 12:06 schrieb Johannes Radinger:
[clip]
> How can I solve that question in SciPy/Python so that I get x in the end? I don't know how to write the code...

--->
erf(x[, out])

    y=erf(z) returns the error function of complex argument defined
    as 2/sqrt(pi)*integral(exp(-t**2),t=0..z)
---

from scipy.special import erf, erfinv
erfinv(0.99)*sqrt(2)

Gregor

> thank you very much
>
> johannes
> _______________________________________________
> SciPy-User mailing list
> SciPy-User at scipy.org
> http://mail.scipy.org/mailman/listinfo/scipy-user
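Rescaling Gregor's standard-normal result by the scale of the original density gives the 99% half-width directly; a small sketch using the values from the first post:

    from math import sqrt
    from scipy.special import erfinv

    s, K = 400.0, 1.0
    x99 = s * K * sqrt(2) * erfinv(0.99)   # the integral over (-x99, x99) is 0.99
    print x99                              # about 1030.3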
http://portal.gmx.net/de/go/atbrowser > > _______________________________________________ > > SciPy-User mailing list > > SciPy-User at scipy.org > > http://mail.scipy.org/mailman/listinfo/scipy-user > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user -- Neu: GMX De-Mail - Einfach wie E-Mail, sicher wie ein Brief! Jetzt De-Mail-Adresse reservieren: http://portal.gmx.net/de/go/demail From jsseabold at gmail.com Tue Dec 21 09:18:15 2010 From: jsseabold at gmail.com (Skipper Seabold) Date: Tue, 21 Dec 2010 09:18:15 -0500 Subject: [SciPy-User] solving integration, density function In-Reply-To: <20101221124827.53380@gmx.net> References: <20101221110650.32180@gmx.net> <07515CE8-8F03-40C6-9A29-FA1AE7AE8AF1@gmail.com> <20101221124827.53380@gmx.net> Message-ID: On Tue, Dec 21, 2010 at 7:48 AM, Johannes Radinger wrote: > > -------- Original-Nachricht -------- >> Datum: Tue, 21 Dec 2010 13:20:47 +0100 >> Von: Gregor Thalhammer >> An: SciPy Users List >> Betreff: Re: [SciPy-User] solving integration, density function > >> >> Am 21.12.2010 um 12:06 schrieb Johannes Radinger: >> >> > Hello, >> > >> > I am really new to python and Scipy. >> > I want to solve a integrated function with a python script >> > and I think Scipy should do that :) >> > >> > My task: >> > >> > I do have some variables (s, m, K,) which are now absolutely set, but in >> future I'll get the values via another process of pyhton. >> > >> > s = 400 >> > m = 0 >> > K = 1 >> > >> > And have have following function: >> > (1/((s*K)*sqrt(2*pi)))*exp((-1/2*(((x-m)/s*K))^2) which is the density >> function of the normal distribution a symetrical curve with the mean (m) of >> 0. >> > >> > The total area under the curve is 1 (100%) which is for an integration >> from -inf to +inf. >> > I want to know x in the case of 99%: meaning that the integral (-x to >> +x) of the function is 0.99. Due to the symetry of the curve you can also set >> the integral from 0 to +x equal to (0.99/2): >> > >> > 0.99 = integral((1/((s*K)*sqrt(2*pi)))*exp((-1/2*(((x-m)/s*K))^2)), -x, >> x) >> > resp. >> > (0.99/2) = integral((1/((s*K)*sqrt(2*pi)))*exp((-1/2*(((x-m)/s*K))^2)), >> 0, x) >> > >> > How can I solve that question in Scipy/python >> > so that I get x in the end. I don't know how to write >> > the code... >> >> >> ---> >> erf(x[, out]) >> >> ? ? y=erf(z) returns the error function of complex argument defined as >> ? ? as 2/sqrt(pi)*integral(exp(-t**2),t=0..z) >> --- >> >> from scipy.special import erf, erfinv >> erfinv(0.99)*sqrt(2) >> >> >> Gregor >> > > > Thank you Gregor, > I only understand a part of your answer... I know that the integral of the density function is a error function and I know that the argument "from scipy.special import erf, erfinv" is to load the module. > > But how do I write the code including my orignial function so that I can modify it (I have also another function I want to integrate). how do i start? I want to save the whole code to a python-script I can then load e.g. into ArcGIS where I want to use the value of x for further calculations. > Are you always integrating densities? 
If so, you don't want to use integrals probably, but you could use scipy.stats erfinv(.99)*np.sqrt(2) 2.5758293035489004 from scipy import stats stats.norm.ppf(.995) 2.5758293035489004 Skipper From JRadinger at gmx.at Tue Dec 21 09:33:16 2010 From: JRadinger at gmx.at (Johannes Radinger) Date: Tue, 21 Dec 2010 15:33:16 +0100 Subject: [SciPy-User] solving integration, density function In-Reply-To: References: <20101221110650.32180@gmx.net> <07515CE8-8F03-40C6-9A29-FA1AE7AE8AF1@gmail.com> <20101221124827.53380@gmx.net> Message-ID: <20101221143316.175560@gmx.net> -------- Original-Nachricht -------- > Datum: Tue, 21 Dec 2010 09:18:15 -0500 > Von: Skipper Seabold > An: SciPy Users List > Betreff: Re: [SciPy-User] solving integration, density function > On Tue, Dec 21, 2010 at 7:48 AM, Johannes Radinger > wrote: > > > > -------- Original-Nachricht -------- > >> Datum: Tue, 21 Dec 2010 13:20:47 +0100 > >> Von: Gregor Thalhammer > >> An: SciPy Users List > >> Betreff: Re: [SciPy-User] solving integration, density function > > > >> > >> Am 21.12.2010 um 12:06 schrieb Johannes Radinger: > >> > >> > Hello, > >> > > >> > I am really new to python and Scipy. > >> > I want to solve a integrated function with a python script > >> > and I think Scipy should do that :) > >> > > >> > My task: > >> > > >> > I do have some variables (s, m, K,) which are now absolutely set, but > in > >> future I'll get the values via another process of pyhton. > >> > > >> > s = 400 > >> > m = 0 > >> > K = 1 > >> > > >> > And have have following function: > >> > (1/((s*K)*sqrt(2*pi)))*exp((-1/2*(((x-m)/s*K))^2) which is the > density > >> function of the normal distribution a symetrical curve with the mean > (m) of > >> 0. > >> > > >> > The total area under the curve is 1 (100%) which is for an > integration > >> from -inf to +inf. > >> > I want to know x in the case of 99%: meaning that the integral (-x to > >> +x) of the function is 0.99. Due to the symetry of the curve you can > also set > >> the integral from 0 to +x equal to (0.99/2): > >> > > >> > 0.99 = integral((1/((s*K)*sqrt(2*pi)))*exp((-1/2*(((x-m)/s*K))^2)), > -x, > >> x) > >> > resp. > >> > (0.99/2) = > integral((1/((s*K)*sqrt(2*pi)))*exp((-1/2*(((x-m)/s*K))^2)), > >> 0, x) > >> > > >> > How can I solve that question in Scipy/python > >> > so that I get x in the end. I don't know how to write > >> > the code... > >> > >> > >> ---> > >> erf(x[, out]) > >> > >> ? ? y=erf(z) returns the error function of complex argument defined > as > >> ? ? as 2/sqrt(pi)*integral(exp(-t**2),t=0..z) > >> --- > >> > >> from scipy.special import erf, erfinv > >> erfinv(0.99)*sqrt(2) > >> > >> > >> Gregor > >> > > > > > > Thank you Gregor, > > I only understand a part of your answer... I know that the integral of > the density function is a error function and I know that the argument "from > scipy.special import erf, erfinv" is to load the module. > > > > But how do I write the code including my orignial function so that I can > modify it (I have also another function I want to integrate). how do i > start? I want to save the whole code to a python-script I can then load e.g. > into ArcGIS where I want to use the value of x for further calculations. > > > > Are you always integrating densities? 
If so, you don't want to use > integrals probably, but you could use scipy.stats > > erfinv(.99)*np.sqrt(2) > 2.5758293035489004 > > from scipy import stats > > stats.norm.ppf(.995) > 2.5758293035489004 > Skipper The second function I want to integrate is different, it is a combination of two normal distributions like: 0.99 = integrate(0.6*((1/((s1*K)*sqrt(2*pi)))*exp((-1/2*(((x-m)/s1*K))^2))+0,4*((1/((s2*K)*sqrt(2*pi)))*exp((-1/2*(((x-m)/s2*K))^2))) and here again I know s1, s2, m and K and want to get x in the case when the integral is 0.99. What do I write into the script I want create? I think it is better if I can explain it with a graph but I don't know if I can just attach pictures to the mail-list-mail. /j _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user -- Neu: GMX De-Mail - Einfach wie E-Mail, sicher wie ein Brief! Jetzt De-Mail-Adresse reservieren: http://portal.gmx.net/de/go/demail From apalomba at austin.rr.com Tue Dec 21 09:47:59 2010 From: apalomba at austin.rr.com (Anthony Palomba) Date: Tue, 21 Dec 2010 08:47:59 -0600 Subject: [SciPy-User] How to get an offline scipy cookbook? In-Reply-To: References: <4D1063AD.50200@gmail.com> Message-ID: If you are on windows, you can use WebZip. It is a very handy utility. It will save any site including links to your hard drive. You specify how deep you want it to go. It can then export the site as a .chm (CHM = *C*ompiled *H* T *M* L Help) file. -ap On Tue, Dec 21, 2010 at 3:37 AM, Pauli Virtanen wrote: > Tue, 21 Dec 2010 00:22:05 -0800, Linuxer Wang wrote: > > I am a new user of scipy. I found that scipy cookbook is a lot more > > helpful for me than the official docs. But I don't know where I can > > download an offline version of the cookbook. Is it ok for me to mirror > > the whole cookbook with wget? (I don't want to blow off the scipy's > > website by mirroring it.) > > There's no offline version. > > The cookbook is not huge, so sensible wget-ing is probably OK. I'd maybe > use --wait=1 or something to not bomb the server too much, though. > > Btw, what aspects of the main docs did you find not useful? > Did you check the tutorials? Are they too long? Too unfocused? Don't > cover enough features? > > -- > Pauli Virtanen > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From jsseabold at gmail.com Tue Dec 21 09:58:35 2010 From: jsseabold at gmail.com (Skipper Seabold) Date: Tue, 21 Dec 2010 09:58:35 -0500 Subject: [SciPy-User] solving integration, density function In-Reply-To: <20101221143316.175560@gmx.net> References: <20101221110650.32180@gmx.net> <07515CE8-8F03-40C6-9A29-FA1AE7AE8AF1@gmail.com> <20101221124827.53380@gmx.net> <20101221143316.175560@gmx.net> Message-ID: On Tue, Dec 21, 2010 at 9:33 AM, Johannes Radinger wrote: > > -------- Original-Nachricht -------- >> Datum: Tue, 21 Dec 2010 09:18:15 -0500 >> Von: Skipper Seabold >> An: SciPy Users List >> Betreff: Re: [SciPy-User] solving integration, density function > >> On Tue, Dec 21, 2010 at 7:48 AM, Johannes Radinger >> wrote: >> > >> > -------- Original-Nachricht -------- >> >> Datum: Tue, 21 Dec 2010 13:20:47 +0100 >> >> Von: Gregor Thalhammer >> >> An: SciPy Users List >> >> Betreff: Re: [SciPy-User] solving integration, density function >> > >> >> >> >> Am 21.12.2010 um 12:06 schrieb Johannes Radinger: >> >> >> >> > Hello, >> >> > >> >> > I am really new to python and Scipy. >> >> > I want to solve a integrated function with a python script >> >> > and I think Scipy should do that :) >> >> > >> >> > My task: >> >> > >> >> > I do have some variables (s, m, K,) which are now absolutely set, but >> in >> >> future I'll get the values via another process of pyhton. >> >> > >> >> > s = 400 >> >> > m = 0 >> >> > K = 1 >> >> > >> >> > And have have following function: >> >> > (1/((s*K)*sqrt(2*pi)))*exp((-1/2*(((x-m)/s*K))^2) which is the >> density >> >> function of the normal distribution a symetrical curve with the mean >> (m) of >> >> 0. >> >> > >> >> > The total area under the curve is 1 (100%) which is for an >> integration >> >> from -inf to +inf. >> >> > I want to know x in the case of 99%: meaning that the integral (-x to >> >> +x) of the function is 0.99. Due to the symetry of the curve you can >> also set >> >> the integral from 0 to +x equal to (0.99/2): >> >> > >> >> > 0.99 = integral((1/((s*K)*sqrt(2*pi)))*exp((-1/2*(((x-m)/s*K))^2)), >> -x, >> >> x) >> >> > resp. >> >> > (0.99/2) = >> integral((1/((s*K)*sqrt(2*pi)))*exp((-1/2*(((x-m)/s*K))^2)), >> >> 0, x) >> >> > >> >> > How can I solve that question in Scipy/python >> >> > so that I get x in the end. I don't know how to write >> >> > the code... >> >> >> >> >> >> ---> >> >> erf(x[, out]) >> >> >> >> ? ? y=erf(z) returns the error function of complex argument defined >> as >> >> ? ? as 2/sqrt(pi)*integral(exp(-t**2),t=0..z) >> >> --- >> >> >> >> from scipy.special import erf, erfinv >> >> erfinv(0.99)*sqrt(2) >> >> >> >> >> >> Gregor >> >> >> > >> > >> > Thank you Gregor, >> > I only understand a part of your answer... I know that the integral of >> the density function is a error function and I know that the argument "from >> scipy.special import erf, erfinv" is to load the module. >> > >> > But how do I write the code including my orignial function so that I can >> modify it (I have also another function I want to integrate). how do i >> start? I want to save the whole code to a python-script I can then load e.g. >> into ArcGIS where I want to use the value of x for further calculations. >> > >> >> Are you always integrating densities? 
?If so, you don't want to use >> integrals probably, but you could use scipy.stats >> >> erfinv(.99)*np.sqrt(2) >> 2.5758293035489004 >> >> from scipy import stats >> >> stats.norm.ppf(.995) >> 2.5758293035489004 > >> Skipper > > The second function I want to integrate is different, it is a combination of two normal distributions like: > > 0.99 = integrate(0.6*((1/((s1*K)*sqrt(2*pi)))*exp((-1/2*(((x-m)/s1*K))^2))+0,4*((1/((s2*K)*sqrt(2*pi)))*exp((-1/2*(((x-m)/s2*K))^2))) > > and here again I know s1, s2, m and K and want to get x in the case when the integral is 0.99. What do I write into the script I want create? > > I think it is better if I can explain it with a graph but I don't know if I can just attach pictures to the mail-list-mail. > The cdf is the integral of pdf and the ppf is the inverse of this function. All of these functions can take an argument for loc and scale, which in your case loc=m and scale = s1*K. I think you can get by with these. You might be able to do something like this. Not sure if this is correct with respect to your weightings, etc. I'd have to think more, but it might get you on the right track. from scipy import optimize def func(x,sigma1,sigma2,m): return .6 * stats.norm.cdf(x, loc=m, scale=sigma1) + .4 * stats.norm.cdf(x, loc=m, scale=sigma2) - .995 optimize.zeros.newton(func, 1., args=(s1*K,s2*K,m)) Skipper From josef.pktd at gmail.com Tue Dec 21 10:00:25 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Tue, 21 Dec 2010 10:00:25 -0500 Subject: [SciPy-User] solving integration, density function In-Reply-To: <20101221143316.175560@gmx.net> References: <20101221110650.32180@gmx.net> <07515CE8-8F03-40C6-9A29-FA1AE7AE8AF1@gmail.com> <20101221124827.53380@gmx.net> <20101221143316.175560@gmx.net> Message-ID: On Tue, Dec 21, 2010 at 9:33 AM, Johannes Radinger wrote: > > -------- Original-Nachricht -------- >> Datum: Tue, 21 Dec 2010 09:18:15 -0500 >> Von: Skipper Seabold >> An: SciPy Users List >> Betreff: Re: [SciPy-User] solving integration, density function > >> On Tue, Dec 21, 2010 at 7:48 AM, Johannes Radinger >> wrote: >> > >> > -------- Original-Nachricht -------- >> >> Datum: Tue, 21 Dec 2010 13:20:47 +0100 >> >> Von: Gregor Thalhammer >> >> An: SciPy Users List >> >> Betreff: Re: [SciPy-User] solving integration, density function >> > >> >> >> >> Am 21.12.2010 um 12:06 schrieb Johannes Radinger: >> >> >> >> > Hello, >> >> > >> >> > I am really new to python and Scipy. >> >> > I want to solve a integrated function with a python script >> >> > and I think Scipy should do that :) >> >> > >> >> > My task: >> >> > >> >> > I do have some variables (s, m, K,) which are now absolutely set, but >> in >> >> future I'll get the values via another process of pyhton. >> >> > >> >> > s = 400 >> >> > m = 0 >> >> > K = 1 >> >> > >> >> > And have have following function: >> >> > (1/((s*K)*sqrt(2*pi)))*exp((-1/2*(((x-m)/s*K))^2) which is the >> density >> >> function of the normal distribution a symetrical curve with the mean >> (m) of >> >> 0. >> >> > >> >> > The total area under the curve is 1 (100%) which is for an >> integration >> >> from -inf to +inf. >> >> > I want to know x in the case of 99%: meaning that the integral (-x to >> >> +x) of the function is 0.99. Due to the symetry of the curve you can >> also set >> >> the integral from 0 to +x equal to (0.99/2): >> >> > >> >> > 0.99 = integral((1/((s*K)*sqrt(2*pi)))*exp((-1/2*(((x-m)/s*K))^2)), >> -x, >> >> x) >> >> > resp. 
>> >> > (0.99/2) = >> integral((1/((s*K)*sqrt(2*pi)))*exp((-1/2*(((x-m)/s*K))^2)), >> >> 0, x) >> >> > >> >> > How can I solve that question in Scipy/python >> >> > so that I get x in the end. I don't know how to write >> >> > the code... >> >> >> >> >> >> ---> >> >> erf(x[, out]) >> >> >> >> ? ? y=erf(z) returns the error function of complex argument defined >> as >> >> ? ? as 2/sqrt(pi)*integral(exp(-t**2),t=0..z) >> >> --- >> >> >> >> from scipy.special import erf, erfinv >> >> erfinv(0.99)*sqrt(2) >> >> >> >> >> >> Gregor >> >> >> > >> > >> > Thank you Gregor, >> > I only understand a part of your answer... I know that the integral of >> the density function is a error function and I know that the argument "from >> scipy.special import erf, erfinv" is to load the module. >> > >> > But how do I write the code including my orignial function so that I can >> modify it (I have also another function I want to integrate). how do i >> start? I want to save the whole code to a python-script I can then load e.g. >> into ArcGIS where I want to use the value of x for further calculations. >> > >> >> Are you always integrating densities? ?If so, you don't want to use >> integrals probably, but you could use scipy.stats >> >> erfinv(.99)*np.sqrt(2) >> 2.5758293035489004 >> >> from scipy import stats >> >> stats.norm.ppf(.995) >> 2.5758293035489004 > >> Skipper > > The second function I want to integrate is different, it is a combination of two normal distributions like: > > 0.99 = integrate(0.6*((1/((s1*K)*sqrt(2*pi)))*exp((-1/2*(((x-m)/s1*K))^2))+0,4*((1/((s2*K)*sqrt(2*pi)))*exp((-1/2*(((x-m)/s2*K))^2))) > > and here again I know s1, s2, m and K and want to get x in the case when the integral is 0.99. What do I write into the script I want create? > > I think it is better if I can explain it with a graph but I don't know if I can just attach pictures to the mail-list-mail. The generic way for finding the ppf in stats distribution, is use scipy.integrate.quad for the integration and scipy.optimize solve for finding x (I'm still too busy for a full answer) Josef > > /j > > ?_______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user > > -- > Neu: GMX De-Mail - Einfach wie E-Mail, sicher wie ein Brief! 
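A sketch of that generic route for the two-normal mixture above; s2 here is a made-up placeholder (only s = 400 was posted), and the bracket for the root finder is an assumption; brentq is one of the scipy.optimize solvers:

    import numpy as np
    from scipy.integrate import quad
    from scipy.optimize import brentq

    s1, s2, K, m = 400.0, 800.0, 1.0, 0.0   # s2 = 800 is only for illustration

    def pdf(x):
        # 0.6/0.4 weighted mixture of two normal densities, as in Johannes' post
        g1 = np.exp(-0.5 * ((x - m) / (s1 * K)) ** 2) / ((s1 * K) * np.sqrt(2 * np.pi))
        g2 = np.exp(-0.5 * ((x - m) / (s2 * K)) ** 2) / ((s2 * K) * np.sqrt(2 * np.pi))
        return 0.6 * g1 + 0.4 * g2

    def mass(x):
        # probability mass in (-x, x), by adaptive quadrature
        return quad(pdf, -x, x)[0]

    x99 = brentq(lambda x: mass(x) - 0.99, 0.0, 1e6)   # bracket [0, 1e6] assumed wide enough
    print x99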
> Jetzt De-Mail-Adresse reservieren: http://portal.gmx.net/de/go/demail > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From JRadinger at gmx.at Tue Dec 21 10:42:41 2010 From: JRadinger at gmx.at (Johannes Radinger) Date: Tue, 21 Dec 2010 16:42:41 +0100 Subject: [SciPy-User] solving integration, density function In-Reply-To: References: <20101221110650.32180@gmx.net> <07515CE8-8F03-40C6-9A29-FA1AE7AE8AF1@gmail.com> <20101221124827.53380@gmx.net> <20101221143316.175560@gmx.net> Message-ID: <20101221154241.151950@gmx.net> -------- Original-Nachricht -------- > Datum: Tue, 21 Dec 2010 09:58:35 -0500 > Von: Skipper Seabold > An: SciPy Users List > Betreff: Re: [SciPy-User] solving integration, density function > On Tue, Dec 21, 2010 at 9:33 AM, Johannes Radinger > wrote: > > > > -------- Original-Nachricht -------- > >> Datum: Tue, 21 Dec 2010 09:18:15 -0500 > >> Von: Skipper Seabold > >> An: SciPy Users List > >> Betreff: Re: [SciPy-User] solving integration, density function > > > >> On Tue, Dec 21, 2010 at 7:48 AM, Johannes Radinger > >> wrote: > >> > > >> > -------- Original-Nachricht -------- > >> >> Datum: Tue, 21 Dec 2010 13:20:47 +0100 > >> >> Von: Gregor Thalhammer > >> >> An: SciPy Users List > >> >> Betreff: Re: [SciPy-User] solving integration, density function > >> > > >> >> > >> >> Am 21.12.2010 um 12:06 schrieb Johannes Radinger: > >> >> > >> >> > Hello, > >> >> > > >> >> > I am really new to python and Scipy. > >> >> > I want to solve a integrated function with a python script > >> >> > and I think Scipy should do that :) > >> >> > > >> >> > My task: > >> >> > > >> >> > I do have some variables (s, m, K,) which are now absolutely set, > but > >> in > >> >> future I'll get the values via another process of pyhton. > >> >> > > >> >> > s = 400 > >> >> > m = 0 > >> >> > K = 1 > >> >> > > >> >> > And have have following function: > >> >> > (1/((s*K)*sqrt(2*pi)))*exp((-1/2*(((x-m)/s*K))^2) which is the > >> density > >> >> function of the normal distribution a symetrical curve with the mean > >> (m) of > >> >> 0. > >> >> > > >> >> > The total area under the curve is 1 (100%) which is for an > >> integration > >> >> from -inf to +inf. > >> >> > I want to know x in the case of 99%: meaning that the integral (-x > to > >> >> +x) of the function is 0.99. Due to the symetry of the curve you can > >> also set > >> >> the integral from 0 to +x equal to (0.99/2): > >> >> > > >> >> > 0.99 = > integral((1/((s*K)*sqrt(2*pi)))*exp((-1/2*(((x-m)/s*K))^2)), > >> -x, > >> >> x) > >> >> > resp. > >> >> > (0.99/2) = > >> integral((1/((s*K)*sqrt(2*pi)))*exp((-1/2*(((x-m)/s*K))^2)), > >> >> 0, x) > >> >> > > >> >> > How can I solve that question in Scipy/python > >> >> > so that I get x in the end. I don't know how to write > >> >> > the code... > >> >> > >> >> > >> >> ---> > >> >> erf(x[, out]) > >> >> > >> >> ? ? y=erf(z) returns the error function of complex argument > defined > >> as > >> >> ? ? as 2/sqrt(pi)*integral(exp(-t**2),t=0..z) > >> >> --- > >> >> > >> >> from scipy.special import erf, erfinv > >> >> erfinv(0.99)*sqrt(2) > >> >> > >> >> > >> >> Gregor > >> >> > >> > > >> > > >> > Thank you Gregor, > >> > I only understand a part of your answer... I know that the integral > of > >> the density function is a error function and I know that the argument > "from > >> scipy.special import erf, erfinv" is to load the module. 
> >> > > >> > But how do I write the code including my orignial function so that I > can > >> modify it (I have also another function I want to integrate). how do i > >> start? I want to save the whole code to a python-script I can then load > e.g. > >> into ArcGIS where I want to use the value of x for further > calculations. > >> > > >> > >> Are you always integrating densities? ?If so, you don't want to use > >> integrals probably, but you could use scipy.stats > >> > >> erfinv(.99)*np.sqrt(2) > >> 2.5758293035489004 > >> > >> from scipy import stats > >> > >> stats.norm.ppf(.995) > >> 2.5758293035489004 > > > >> Skipper > > > > The second function I want to integrate is different, it is a > combination of two normal distributions like: > > > > 0.99 = > integrate(0.6*((1/((s1*K)*sqrt(2*pi)))*exp((-1/2*(((x-m)/s1*K))^2))+0,4*((1/((s2*K)*sqrt(2*pi)))*exp((-1/2*(((x-m)/s2*K))^2))) > > > > and here again I know s1, s2, m and K and want to get x in the case when > the integral is 0.99. What do I write into the script I want create? > > > > I think it is better if I can explain it with a graph but I don't know > if I can just attach pictures to the mail-list-mail. > > > > The cdf is the integral of pdf and the ppf is the inverse of this > function. All of these functions can take an argument for loc and > scale, which in your case loc=m and scale = s1*K. I think you can get > by with these. You might be able to do something like this. Not sure > if this is correct with respect to your weightings, etc. I'd have to > think more, but it might get you on the right track. > > from scipy import optimize > > def func(x,sigma1,sigma2,m): > return .6 * stats.norm.cdf(x, loc=m, scale=sigma1) + .4 * > stats.norm.cdf(x, > loc=m, scale=sigma2) - .995 > > optimize.zeros.newton(func, 1., args=(s1*K,s2*K,m)) > ooh that is what I was looking for...you helped me a lot, I just need now to get the optimization subpackage to run...my IDE (IDLE always crashes when I want to run: "from scipy import optimize" ... I am working with Pyhton 2.6.5 and the actual Scipy package (downloaded today) on Windows 7... but thanks a lot for all your help /johannes > Skipper > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user -- GMX.at - ?sterreichs FreeMail-Dienst mit ?ber 2 Mio Mitgliedern E-Mail, SMS & mehr! 
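Skipper's mixture function plus a bracketed root finder gives a complete script for the two-normal case. The sketch below is illustrative: only s=400, m=0, K=1 are fixed in the thread, so the second scale s2 is a made-up assumption, and brentq is used instead of newton because it cannot run away from the root. The last line checks the result against the sum-of-two-error-functions form that comes from splitting the integral (linearity, as pointed out further down the thread):

import numpy as np
from scipy import stats, optimize
from scipy.special import erf

m, K = 0.0, 1.0
s1, s2 = 400.0, 200.0   # s2 is an illustrative assumption

def mix_cdf(x):
    # the CDF of a mixture is the same weighted sum as its density
    return (0.6 * stats.norm.cdf(x, loc=m, scale=s1 * K)
            + 0.4 * stats.norm.cdf(x, loc=m, scale=s2 * K))

# symmetry around m = 0 turns P(-x < X < x) = 0.99 into mix_cdf(x) = 0.995
x99 = optimize.brentq(lambda x: mix_cdf(x) - 0.995, 0.0, 10 * max(s1, s2) * K)
print x99

# cross-check: the integral from -x to x is a weighted sum of error functions
print 0.6 * erf(x99 / (s1 * K * np.sqrt(2))) \
    + 0.4 * erf(x99 / (s2 * K * np.sqrt(2)))   # ~0.99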
From josef.pktd at gmail.com Tue Dec 21 10:50:54 2010
From: josef.pktd at gmail.com (josef.pktd at gmail.com)
Date: Tue, 21 Dec 2010 10:50:54 -0500
Subject: [SciPy-User] solving integration, density function
In-Reply-To: References: <20101221110650.32180@gmx.net> <07515CE8-8F03-40C6-9A29-FA1AE7AE8AF1@gmail.com> <20101221124827.53380@gmx.net> <20101221143316.175560@gmx.net>
Message-ID:

On Tue, Dec 21, 2010 at 9:58 AM, Skipper Seabold wrote:
> [...]
> from scipy import optimize
>
> def func(x,sigma1,sigma2,m):
>     return .6 * stats.norm.cdf(x, loc=m, scale=sigma1) + .4 * stats.norm.cdf(x,
>         loc=m, scale=sigma2) - .995
>
> optimize.zeros.newton(func, 1., args=(s1*K,s2*K,m))

I think it's better to stick with the main namespace optimize.newton. I think scipy.stats.distributions are using fsolve. Skipper's way is the most direct way to calculate this.

Johannes, scipy.stats.distributions has a lot of generic functions/methods to work with distributions, and reading the source for some of them might be useful to you. Another alternative is to subclass stats.distribution and take advantage of the generic methods directly. But compared to Skipper's direct solution, this would only be worth it if you need more properties. An old example of mine is at http://mail.scipy.org/pipermail/scipy-user/2009-May/021182.html (the discussion was more about estimation)

Josef

> Skipper
> _______________________________________________
> SciPy-User mailing list
> SciPy-User at scipy.org
> http://mail.scipy.org/mailman/listinfo/scipy-user

From linuxerwang at gmail.com Tue Dec 21 12:04:17 2010
From: linuxerwang at gmail.com (Linuxer Wang)
Date: Tue, 21 Dec 2010 09:04:17 -0800
Subject: [SciPy-User] How to get an offline scipy cookbook?
In-Reply-To: References: <4D1063AD.50200@gmail.com>
Message-ID: <4D10DE11.1050903@gmail.com>

Thank you, Pauli!

Sorry I didn't express it appropriately. The main doc is actually really good. I like the cookbook because some of its examples directly match my specialty. And its structure is more suitable for jumping directly to what I am interested in (and skipping all the others).

Best Regards,
- linuxerwang

On 12/21/2010 01:37 AM, Pauli Virtanen wrote:
> Tue, 21 Dec 2010 00:22:05 -0800, Linuxer Wang wrote:
>> I am a new user of scipy. I found that scipy cookbook is a lot more
>> helpful for me than the official docs. But I don't know where I can
>> download an offline version of the cookbook. Is it ok for me to mirror
>> the whole cookbook with wget? (I don't want to blow off the scipy's
>> website by mirroring it.)
>
> There's no offline version.
>
> The cookbook is not huge, so sensible wget-ing is probably OK. I'd maybe
> use --wait=1 or something to not bomb the server too much, though.
>
> Btw, what aspects of the main docs did you find not useful?
> Did you check the tutorials? Are they too long? Too unfocused? Don't
> cover enough features?

From linuxerwang at gmail.com Tue Dec 21 12:05:56 2010
From: linuxerwang at gmail.com (Linuxer Wang)
Date: Tue, 21 Dec 2010 09:05:56 -0800
Subject: [SciPy-User] How to get an offline scipy cookbook?
In-Reply-To: References: <4D1063AD.50200@gmail.com>
Message-ID: <4D10DE74.4040408@gmail.com>

On 12/21/2010 06:47 AM, Anthony Palomba wrote:
> If you are on windows, you can use WebZip. It is a very handy
> utility. It will save any site including links to your hard drive. You
> specify how deep you want it to go. It can then export the site
> as a .chm (CHM = Compiled HTML Help) file.

Thank you. I work on linux exclusively and wget can do everything.

> -ap
>
> On Tue, Dec 21, 2010 at 3:37 AM, Pauli Virtanen wrote:
> [...]
>
> _______________________________________________
> SciPy-User mailing list
> SciPy-User at scipy.org
> http://mail.scipy.org/mailman/listinfo/scipy-user

From david_baddeley at yahoo.com.au Tue Dec 21 15:22:07 2010
From: david_baddeley at yahoo.com.au (David Baddeley)
Date: Tue, 21 Dec 2010 12:22:07 -0800 (PST)
Subject: [SciPy-User] solving integration, density function
In-Reply-To: <20101221143316.175560@gmx.net>
References: <20101221110650.32180@gmx.net> <07515CE8-8F03-40C6-9A29-FA1AE7AE8AF1@gmail.com> <20101221124827.53380@gmx.net> <20101221143316.175560@gmx.net>
Message-ID: <534022.44467.qm@web113419.mail.gq1.yahoo.com>

For your 2nd function, you should be able to use the linearity of integration and integrate the two terms separately, giving you the sum of two error functions. For a general case when you can't / don't want to find an analytic solution, the stuff for numeric integration is in scipy.integrate, as Josef mentioned.

cheers,
David

----- Original Message ----
From: Johannes Radinger .
To: SciPy Users List
Sent: Wed, 22 December, 2010 3:33:16 AM
Subject: Re: [SciPy-User] solving integration, density function

[...]
/j _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user -- Neu: GMX De-Mail - Einfach wie E-Mail, sicher wie ein Brief! Jetzt De-Mail-Adresse reservieren: http://portal.gmx.net/de/go/demail _______________________________________________ SciPy-User mailing list SciPy-User at scipy.org http://mail.scipy.org/mailman/listinfo/scipy-user From scott.sinclair.za at gmail.com Wed Dec 22 06:44:38 2010 From: scott.sinclair.za at gmail.com (Scott Sinclair) Date: Wed, 22 Dec 2010 13:44:38 +0200 Subject: [SciPy-User] Building scipy on Linux In-Reply-To: References: Message-ID: On 21 December 2010 02:55, Willard Maier wrote: > I'm trying to build scipy and numpy on Linux (Mint, a derivative of Ubuntu), > and I'm following the directions on > http://www.scipy.org/Installing_SciPy/Linux > for "Building everything from source using gfortran on Ubuntu, Nov 2010. > After several hours of work I finally got to the step for building scipy, > "python setup.py build", but here I get the following error: > > Traceback (most recent call last): > ? File "setup.py", line 85, in > ??? FULLVERSION += svn_version() > ? File "setup.py", line 58, in svn_version > ??? from numpy.compat import asstr > ImportError: No module named compat It sounds like you don't have Numpy installed. Building (and running) Scipy relies on Numpy... Cheers, Scott From dejan.org at gmail.com Wed Dec 22 12:47:19 2010 From: dejan.org at gmail.com (otrov) Date: Wed, 22 Dec 2010 18:47:19 +0100 Subject: [SciPy-User] Identify unique sequence data from array Message-ID: <42523011.20101222184719@gmail.com> Hi, I tried to seek for help on three other lists, but as this problem apparently can't be easily solved in matlab/octave(!?), I thought to try scipy/numpy and maybe gain advantage from python as more feature rich descriptive language The problem: I have 2D data sets (scipy/numpy arrays) of 10^7 to 10^8 rows, which consists of repeated sequences of one unique sequence, usually ~10^5 rows, but may differ in scale. Period is same for both columns, so there is not really difference if we consider 2D or 1D array. I want to track this data block. Simplified problem: X = array([[1, 2], [1, 2], [2, 2], [3, 1], [2, 3], [1, 2], [1, 2], [2, 2], [3, 1], [2, 3], [1, 2], [1, 2], [2, 2], [3, 1], [2, 3], ..., [1, 2], [1, 2], [2, 2], [3, 1], [2, 3]] I would like to extract repeated sequence data: Y = array([[1, 2], [1, 2], [2, 2], [3, 1], [2, 3]] as a result. Or presented more visually: I want to identify unique sequence data: A B C D D D A B C D D D A B C D D D |_________| |_________| |_________| | | | unique unique unique sequence sequence sequence data data data Thanks for your time From faltet at pytables.org Wed Dec 22 13:58:41 2010 From: faltet at pytables.org (Francesc Alted) Date: Wed, 22 Dec 2010 19:58:41 +0100 Subject: [SciPy-User] ANN: carray 0.3 released Message-ID: <201012221958.41105.faltet@pytables.org> ===================== Announcing carray 0.3 ===================== What's new ========== A lot of stuff. The most outstanding feature in this version is the introduction of a `ctable` object. A `ctable` is similar to a structured array in NumPy, but instead of storing the data row-wise, it uses a column-wise arrangement. This allows for much better performance for very wide tables, which is one of the scenarios where a `ctable` makes more sense. 
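To make the column-wise layout concrete, here is a minimal sketch. It is not part of the announcement, and the exact `ctable` constructor signature shown is an assumption, so check the manual linked further down before relying on it:

import numpy as np
import carray as ca

N = 1000 * 1000
f0 = ca.carray(np.arange(N))           # each column is its own compressed carray
f1 = ca.carray(np.random.rand(N))
t = ca.ctable((f0, f1), names=['f0', 'f1'])
print t['f0'][:5]                       # columns stay independently addressable

Because every column is compressed and stored on its own, touching a few columns of a very wide table only reads the bytes of those columns.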
Of course, as `ctable` is based on `carray` objects, it inherits all its niceties (like on-the-fly compression and fast iterators).

Also, the `carray` object itself has received many improvements, like new constructors (arange(), fromiter(), zeros(), ones(), fill()), iterators (where(), wheretrue()) or resize methods (resize(), trim()). Most of these also work with the new `ctable`.

Besides, Numexpr is supported now (but it is optional) in order to carry out stunningly fast queries on `ctable` objects. For example, doing a query on a table with one million rows and one thousand columns can be up to 2x faster than using a plain structured array, and up to 20x faster than using SQLite (using the ":memory:" backend and indexing). See 'bench/ctable-query.py' for details.

Finally, binaries for Windows (both 32-bit and 64-bit) are provided.

For more detailed info, see the release notes in: https://github.com/FrancescAlted/carray/wiki/Release-0.3

What it is
==========

carray is a container for numerical data that can be compressed in-memory. The compression process is carried out internally by Blosc, a high-performance compressor that is optimized for binary data. Having data compressed in-memory can reduce the stress of the memory subsystem. The net result is that carray operations may be faster than using a traditional ndarray object from NumPy.

carray also supports fully 64-bit addressing (both in UNIX and Windows). Below, a carray with 1 trillion rows has been created (7.3 TB total), filled with zeros, modified at some positions, and finally summed up::

>>> %time b = ca.zeros(1e12)
CPU times: user 54.76 s, sys: 0.03 s, total: 54.79 s
Wall time: 55.23 s
>>> %time b[[1, 1e9, 1e10, 1e11, 1e12-1]] = (1,2,3,4,5)
CPU times: user 2.08 s, sys: 0.00 s, total: 2.08 s
Wall time: 2.09 s
>>> b
carray((1000000000000,), float64)
  nbytes: 7450.58 GB; cbytes: 2.27 GB; ratio: 3275.35
  cparams := cparams(clevel=5, shuffle=True)
[0.0, 1.0, 0.0, ..., 0.0, 0.0, 5.0]
>>> %time b.sum()
CPU times: user 10.08 s, sys: 0.00 s, total: 10.08 s
Wall time: 10.15 s
15.0

['%time' is a magic function provided by the IPython shell]

Please note that the example above is provided for demonstration purposes only. Do not try to run this at home unless you have more than 3 GB of RAM available, or you will get into trouble.

Resources
=========

Visit the main carray site repository at: http://github.com/FrancescAlted/carray

You can download a source package from: http://carray.pytables.org/download

Manual: http://carray.pytables.org/manual

Home of Blosc compressor: http://blosc.pytables.org

User's mail list: carray at googlegroups.com http://groups.google.com/group/carray

Share your experience
=====================

Let us know of any bugs, suggestions, gripes, kudos, etc. you may have.

---- Enjoy!
-- Francesc Alted From robert.kern at gmail.com Wed Dec 22 14:52:17 2010 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 22 Dec 2010 14:52:17 -0500 Subject: [SciPy-User] Identify unique sequence data from array In-Reply-To: <42523011.20101222184719@gmail.com> References: <42523011.20101222184719@gmail.com> Message-ID: On Wed, Dec 22, 2010 at 12:47, otrov wrote: > Hi, > I tried to seek for help on three other lists, but as this problem apparently can't be easily solved in matlab/octave(!?), I thought to try scipy/numpy and maybe gain advantage from python as more feature rich descriptive language > > The problem: > > I have 2D data sets (scipy/numpy arrays) of 10^7 to 10^8 rows, which consists of repeated sequences of one unique sequence, usually ~10^5 rows, but may differ in scale. Period is same for both columns, so there is not really difference if we consider 2D or 1D array. > I want to track this data block. for i in range(1, len(X)-1): if (X[i:] == X[:-i]).all(): break -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." ? -- Umberto Eco From dejan.org at gmail.com Wed Dec 22 15:18:03 2010 From: dejan.org at gmail.com (otrov) Date: Wed, 22 Dec 2010 21:18:03 +0100 Subject: [SciPy-User] Identify unique sequence data from array In-Reply-To: References: <42523011.20101222184719@gmail.com> Message-ID: <225079401.20101222211803@gmail.com> >> The problem: >> I have 2D data sets (scipy/numpy arrays) of 10^7 to 10^8 rows, which consists of repeated sequences of one unique sequence, usually ~10^5 rows, but may differ in scale. Period is same for both columns, so there is not really difference if we consider 2D or 1D array. >> I want to track this data block. > for i in range(1, len(X)-1): > if (X[i:] == X[:-i]).all(): > break Just look at that python beauty! Such a great language when in hand of a smart user. Thanks for you snippet, but unfortunately it takes forever to finish the task From josef.pktd at gmail.com Wed Dec 22 15:27:36 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Wed, 22 Dec 2010 15:27:36 -0500 Subject: [SciPy-User] Identify unique sequence data from array In-Reply-To: <225079401.20101222211803@gmail.com> References: <42523011.20101222184719@gmail.com> <225079401.20101222211803@gmail.com> Message-ID: On Wed, Dec 22, 2010 at 3:18 PM, otrov wrote: >>> The problem: > >>> I have 2D data sets (scipy/numpy arrays) of 10^7 to 10^8 rows, which consists of repeated sequences of one unique sequence, usually ~10^5 rows, but may differ in scale. Period is same for both columns, so there is not really difference if we consider 2D or 1D array. >>> I want to track this data block. > >> for i in range(1, len(X)-1): >> ? ? if (X[i:] == X[:-i]).all(): >> ? ? ? ? break I don't see how this works, isn't it (X[:i] == X[-i:]).all(): with an integer repeat, there should also be a restriction that n/i is an int, otherwise the repeat is not possible. if n//i != n/float(i): continue or mod == 0 ? Josef > > Just look at that python beauty! Such a great language when in hand of a smart user. 
> Thanks for you snippet, but unfortunately it takes forever to finish the task > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From robert.kern at gmail.com Wed Dec 22 15:57:07 2010 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 22 Dec 2010 15:57:07 -0500 Subject: [SciPy-User] Identify unique sequence data from array In-Reply-To: References: <42523011.20101222184719@gmail.com> <225079401.20101222211803@gmail.com> Message-ID: On Wed, Dec 22, 2010 at 15:27, wrote: > On Wed, Dec 22, 2010 at 3:18 PM, otrov wrote: >>>> The problem: >> >>>> I have 2D data sets (scipy/numpy arrays) of 10^7 to 10^8 rows, which consists of repeated sequences of one unique sequence, usually ~10^5 rows, but may differ in scale. Period is same for both columns, so there is not really difference if we consider 2D or 1D array. >>>> I want to track this data block. >> >>> for i in range(1, len(X)-1): >>> ? ? if (X[i:] == X[:-i]).all(): >>> ? ? ? ? break > > I don't see how this works, isn't it > > (X[:i] == X[-i:]).all(): Not if the repeated subsequence is [1, 2, 3, 1, 2]. That said, my method probably also has a counterexample. > with an integer repeat, there should also be a restriction that n/i is > an int, otherwise the repeat is not possible. > > if n//i != n/float(i): continue > > or mod == 0 I allowed for the sequence to have some incomplete part of the repeated section at the tail end. If the sequence is a perfect multiple, then you can avoid doing the expensive test if (n % i) != 0. For a 1D sequence, you can also try reshaping: for i in range(2, len(X)//2): if (n % i) != 0: continue Y = X.reshape((-1, i)) if (Y == Y[0]).all(): break For 2D sequences (probably): rowlen = X.shape[1] for i in range(2, len(X)//2): if (n % i) != 0: continue Y = X.reshape((-1, i, rowlen)) if (Y == Y[0]).all(): break -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." ? -- Umberto Eco From paul.anton.letnes at gmail.com Wed Dec 22 16:47:30 2010 From: paul.anton.letnes at gmail.com (Paul Anton Letnes) Date: Wed, 22 Dec 2010 22:47:30 +0100 Subject: [SciPy-User] Identify unique sequence data from array In-Reply-To: <225079401.20101222211803@gmail.com> References: <42523011.20101222184719@gmail.com> <225079401.20101222211803@gmail.com> Message-ID: <800F4E15-FF35-43F6-A37E-427B67EBC259@gmail.com> On 22. des. 2010, at 21.18, otrov wrote: >>> The problem: > >>> I have 2D data sets (scipy/numpy arrays) of 10^7 to 10^8 rows, which consists of repeated sequences of one unique sequence, usually ~10^5 rows, but may differ in scale. Period is same for both columns, so there is not really difference if we consider 2D or 1D array. >>> I want to track this data block. > >> for i in range(1, len(X)-1): >> if (X[i:] == X[:-i]).all(): >> break > > Just look at that python beauty! Such a great language when in hand of a smart user. > Thanks for you snippet, but unfortunately it takes forever to finish the task You could also check one element at a time. I think it will be faster, because it will break if comparison of the first element doesn't hold. Then, if you find such an occurrence, use Robert's method to double check that you found the true repetition period. Code: >>> a = [1,2,3,4,1,2,3,4,1,2,3,4] >>> a = numpy.array(a) >>> for i in range(1, 1+a.size/2): ... 
if (a[0] == a[::i]).all(): print 'period is ',i ... ... period is 4 Cheers Paul. From robert.kern at gmail.com Wed Dec 22 16:51:13 2010 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 22 Dec 2010 16:51:13 -0500 Subject: [SciPy-User] Identify unique sequence data from array In-Reply-To: <800F4E15-FF35-43F6-A37E-427B67EBC259@gmail.com> References: <42523011.20101222184719@gmail.com> <225079401.20101222211803@gmail.com> <800F4E15-FF35-43F6-A37E-427B67EBC259@gmail.com> Message-ID: On Wed, Dec 22, 2010 at 16:47, Paul Anton Letnes wrote: > > On 22. des. 2010, at 21.18, otrov wrote: > >>>> The problem: >> >>>> I have 2D data sets (scipy/numpy arrays) of 10^7 to 10^8 rows, which consists of repeated sequences of one unique sequence, usually ~10^5 rows, but may differ in scale. Period is same for both columns, so there is not really difference if we consider 2D or 1D array. >>>> I want to track this data block. >> >>> for i in range(1, len(X)-1): >>> ? ?if (X[i:] == X[:-i]).all(): >>> ? ? ? ?break >> >> Just look at that python beauty! Such a great language when in hand of a smart user. >> Thanks for you snippet, but unfortunately it takes forever to finish the task > > You could also check one element at a time. I think it will be faster, because it will break if comparison of the first element doesn't hold. Then, if you find such an occurrence, use Robert's method to double check that you found the true repetition period. Excellent point. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." ? -- Umberto Eco From willardmaier at gmail.com Wed Dec 22 17:54:17 2010 From: willardmaier at gmail.com (Willard Maier) Date: Wed, 22 Dec 2010 16:54:17 -0600 Subject: [SciPy-User] Building scipy on Linux Message-ID: Scott, thanks for the reply. However I did build numpy first in the sequence, and it built without errors. I'm beginning to think I may have incompatible versions of the source code. -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Wed Dec 22 18:00:36 2010 From: charlesr.harris at gmail.com (Charles R Harris) Date: Wed, 22 Dec 2010 16:00:36 -0700 Subject: [SciPy-User] Identify unique sequence data from array In-Reply-To: References: <42523011.20101222184719@gmail.com> <225079401.20101222211803@gmail.com> <800F4E15-FF35-43F6-A37E-427B67EBC259@gmail.com> Message-ID: On Wed, Dec 22, 2010 at 2:51 PM, Robert Kern wrote: > On Wed, Dec 22, 2010 at 16:47, Paul Anton Letnes > wrote: > > > > On 22. des. 2010, at 21.18, otrov wrote: > > > >>>> The problem: > >> > >>>> I have 2D data sets (scipy/numpy arrays) of 10^7 to 10^8 rows, which > consists of repeated sequences of one unique sequence, usually ~10^5 rows, > but may differ in scale. Period is same for both columns, so there is not > really difference if we consider 2D or 1D array. > >>>> I want to track this data block. > >> > >>> for i in range(1, len(X)-1): > >>> if (X[i:] == X[:-i]).all(): > >>> break > >> > >> Just look at that python beauty! Such a great language when in hand of a > smart user. > >> Thanks for you snippet, but unfortunately it takes forever to finish the > task > > > > You could also check one element at a time. I think it will be faster, > because it will break if comparison of the first element doesn't hold. 
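Paul's cheap pre-check and Robert's reshape verification combine naturally into one function. A runnable sketch, assuming the array length is an exact multiple of the period (the test data are made up):

import numpy as np

def find_period(a):
    n = a.size
    for i in range(1, n // 2 + 1):
        if n % i != 0:
            continue
        if not (a[0] == a[::i]).all():   # cheap necessary condition, fails fast
            continue
        Y = a.reshape((-1, i))           # full verification of the candidate
        if (Y == Y[0]).all():
            return i
    return n                             # no shorter repeated block found

a = np.array([1, 2, 3, 1, 2] * 4)
print find_period(a)                     # 5

The pre-check touches one element per stride, so bad candidates are rejected long before the full O(n) comparison runs.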
Then, > if you find such an occurrence, use Robert's method to double check that you > found the true repetition period. > > Excellent point. > > Why not do an FFT and look at the shape around the carrier frequency? The DC level should probably be subtracted first. It shoud also be possible to construct a Weiner filter to extract the sequences if they don't occur with strict periods. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From willardmaier at gmail.com Wed Dec 22 18:15:23 2010 From: willardmaier at gmail.com (Bill Maier) Date: Wed, 22 Dec 2010 17:15:23 -0600 Subject: [SciPy-User] Building scipy on Linux References: AANLkTim1oh5JV53V_Xa9XTwqGQ-EzcDmiscfSm7zOgkM@mail.gmail.com Message-ID: <4D12868B.8020307@gmail.com> Scott, thanks for the reply. However I did build numpy first in the sequence, and it built without errors. I'm beginning to think I may have incompatible versions of the source code. (Hoping this one makes it as a reply rather than posting as a new thread). Bill From charlesr.harris at gmail.com Wed Dec 22 18:48:29 2010 From: charlesr.harris at gmail.com (Charles R Harris) Date: Wed, 22 Dec 2010 16:48:29 -0700 Subject: [SciPy-User] Building scipy on Linux In-Reply-To: References: Message-ID: On Wed, Dec 22, 2010 at 3:54 PM, Willard Maier wrote: > Scott, thanks for the reply. However I did build numpy first in the > sequence, and it built without errors. I'm beginning to think I may have > incompatible versions of the source code. > > On my system ubuntu installs a self compiled versions of numpy in /usr/local/lib/python2.6/dist-packages/ where it won't normally be found if you have already got a version installed in /usr/lib/python2.6/dist-packages/. One solution to that problem is to make a *.pth file $charris at ubuntu ~$ cat .local/lib/python2.6/site-packages/install.pth /usr/local/lib/python2.6/dist-packages Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From jsseabold at gmail.com Wed Dec 22 23:49:02 2010 From: jsseabold at gmail.com (Skipper Seabold) Date: Wed, 22 Dec 2010 23:49:02 -0500 Subject: [SciPy-User] Identify unique sequence data from array In-Reply-To: References: <42523011.20101222184719@gmail.com> <225079401.20101222211803@gmail.com> <800F4E15-FF35-43F6-A37E-427B67EBC259@gmail.com> Message-ID: On Wed, Dec 22, 2010 at 6:00 PM, Charles R Harris wrote: > > > On Wed, Dec 22, 2010 at 2:51 PM, Robert Kern wrote: >> >> On Wed, Dec 22, 2010 at 16:47, Paul Anton Letnes >> wrote: >> > >> > On 22. des. 2010, at 21.18, otrov wrote: >> > >> >>>> The problem: >> >> >> >>>> I have 2D data sets (scipy/numpy arrays) of 10^7 to 10^8 rows, which >> >>>> consists of repeated sequences of one unique sequence, usually ~10^5 rows, >> >>>> but may differ in scale. Period is same for both columns, so there is not >> >>>> really difference if we consider 2D or 1D array. >> >>>> I want to track this data block. >> >> >> >>> for i in range(1, len(X)-1): >> >>> ? ?if (X[i:] == X[:-i]).all(): >> >>> ? ? ? ?break >> >> >> >> Just look at that python beauty! Such a great language when in hand of >> >> a smart user. >> >> Thanks for you snippet, but unfortunately it takes forever to finish >> >> the task >> > >> > You could also check one element at a time. I think it will be faster, >> > because it will break if comparison of the first element doesn't hold. Then, >> > if you find such an occurrence, use Robert's method to double check that you >> > found the true repetition period. >> >> Excellent point. 
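The FFT idea suggested above, sketched on synthetic data. Illustrative only: with an exact integer period the spectrum is nonzero only at multiples of n/period, so the smallest significant harmonic recovers the period; noisy or drifting data would need the Wiener-filter style care mentioned:

import numpy as np

np.random.seed(0)
x = np.tile(np.random.rand(100), 50)   # exact repetition, period 100
x = x - x.mean()                       # subtract the DC level first

power = np.abs(np.fft.rfft(x)) ** 2
idx = np.nonzero(power > 1e-8 * power.max())[0]
k0 = idx[idx > 0][0]                   # smallest nonzero harmonic
print len(x) // k0                     # 100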
>> > > Why not do an FFT and look at the shape around the carrier frequency? The DC > level should probably be subtracted first. It shoud also be possible to > construct a Weiner filter to extract the sequences if they don't occur with > strict periods. > Could you give an example? I've used a convolution to find the number of successive discrete events, but I'm not sure how to generalize it or if your suggestion is similar. For example, to count the number of three successes in a row for Bernoulli trials and to find where In [1]: import numpy as np In [2]: x = np.array([1,1,1,0,0,1,0,1,0,1,1,1,0,1,0,1]) In [3]: y = np.array([1,1,1]) In [4]: idx = np.convolve(x,y) In [5]: num_runs = len(np.where(idx==len(y))[0]) In [6]: # Extract runs from original array In [6]: idx = np.where(idx==len(y))[0]-(len(y)-1) In [7]: idx = np.hstack([np.arange(i,i+3) for i in idx]) In [8]: x[idx] Out[8]: array([1, 1, 1, 1, 1, 1]) Skipper From josef.pktd at gmail.com Thu Dec 23 00:12:06 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Thu, 23 Dec 2010 00:12:06 -0500 Subject: [SciPy-User] Identify unique sequence data from array In-Reply-To: References: <42523011.20101222184719@gmail.com> <225079401.20101222211803@gmail.com> <800F4E15-FF35-43F6-A37E-427B67EBC259@gmail.com> Message-ID: On Wed, Dec 22, 2010 at 11:49 PM, Skipper Seabold wrote: > On Wed, Dec 22, 2010 at 6:00 PM, Charles R Harris > wrote: >> >> >> On Wed, Dec 22, 2010 at 2:51 PM, Robert Kern wrote: >>> >>> On Wed, Dec 22, 2010 at 16:47, Paul Anton Letnes >>> wrote: >>> > >>> > On 22. des. 2010, at 21.18, otrov wrote: >>> > >>> >>>> The problem: >>> >> >>> >>>> I have 2D data sets (scipy/numpy arrays) of 10^7 to 10^8 rows, which >>> >>>> consists of repeated sequences of one unique sequence, usually ~10^5 rows, >>> >>>> but may differ in scale. Period is same for both columns, so there is not >>> >>>> really difference if we consider 2D or 1D array. >>> >>>> I want to track this data block. >>> >> >>> >>> for i in range(1, len(X)-1): >>> >>> ? ?if (X[i:] == X[:-i]).all(): >>> >>> ? ? ? ?break >>> >> >>> >> Just look at that python beauty! Such a great language when in hand of >>> >> a smart user. >>> >> Thanks for you snippet, but unfortunately it takes forever to finish >>> >> the task >>> > >>> > You could also check one element at a time. I think it will be faster, >>> > because it will break if comparison of the first element doesn't hold. Then, >>> > if you find such an occurrence, use Robert's method to double check that you >>> > found the true repetition period. >>> >>> Excellent point. >>> >> >> Why not do an FFT and look at the shape around the carrier frequency? The DC >> level should probably be subtracted first. It shoud also be possible to >> construct a Weiner filter to extract the sequences if they don't occur with >> strict periods. >> > > Could you give an example? ?I've used a convolution to find the number > of successive discrete events, but I'm not sure how to generalize it > or if your suggestion is similar. 
> > For example, to count the number of three successes in a row for > Bernoulli trials and to find where > > In [1]: import numpy as np > > In [2]: x = np.array([1,1,1,0,0,1,0,1,0,1,1,1,0,1,0,1]) > > In [3]: y = np.array([1,1,1]) > > In [4]: idx = np.convolve(x,y) > > In [5]: num_runs = len(np.where(idx==len(y))[0]) > > In [6]: # Extract runs from original array > > In [6]: idx = np.where(idx==len(y))[0]-(len(y)-1) > > In [7]: idx = np.hstack([np.arange(i,i+3) for i in idx]) > > In [8]: x[idx] > Out[8]: array([1, 1, 1, 1, 1, 1]) It's different because in the original case you want to find the periodicity, for example calculate the periodogram/fft and find the spike. This should give the frequency and then the period length. If there are small numerical problems, it would still narrow down the range to search with direct comparison. Josef (finding runs sounds like a fun problem, more than multiple tests and comparisons) > > Skipper > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From josef.pktd at gmail.com Thu Dec 23 00:24:50 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Thu, 23 Dec 2010 00:24:50 -0500 Subject: [SciPy-User] Identify unique sequence data from array In-Reply-To: References: <42523011.20101222184719@gmail.com> <225079401.20101222211803@gmail.com> <800F4E15-FF35-43F6-A37E-427B67EBC259@gmail.com> Message-ID: On Thu, Dec 23, 2010 at 12:12 AM, wrote: > On Wed, Dec 22, 2010 at 11:49 PM, Skipper Seabold wrote: >> On Wed, Dec 22, 2010 at 6:00 PM, Charles R Harris >> wrote: >>> >>> >>> On Wed, Dec 22, 2010 at 2:51 PM, Robert Kern wrote: >>>> >>>> On Wed, Dec 22, 2010 at 16:47, Paul Anton Letnes >>>> wrote: >>>> > >>>> > On 22. des. 2010, at 21.18, otrov wrote: >>>> > >>>> >>>> The problem: >>>> >> >>>> >>>> I have 2D data sets (scipy/numpy arrays) of 10^7 to 10^8 rows, which >>>> >>>> consists of repeated sequences of one unique sequence, usually ~10^5 rows, >>>> >>>> but may differ in scale. Period is same for both columns, so there is not >>>> >>>> really difference if we consider 2D or 1D array. >>>> >>>> I want to track this data block. >>>> >> >>>> >>> for i in range(1, len(X)-1): >>>> >>> ? ?if (X[i:] == X[:-i]).all(): >>>> >>> ? ? ? ?break >>>> >> >>>> >> Just look at that python beauty! Such a great language when in hand of >>>> >> a smart user. >>>> >> Thanks for you snippet, but unfortunately it takes forever to finish >>>> >> the task >>>> > >>>> > You could also check one element at a time. I think it will be faster, >>>> > because it will break if comparison of the first element doesn't hold. Then, >>>> > if you find such an occurrence, use Robert's method to double check that you >>>> > found the true repetition period. >>>> >>>> Excellent point. >>>> >>> >>> Why not do an FFT and look at the shape around the carrier frequency? The DC >>> level should probably be subtracted first. It shoud also be possible to >>> construct a Weiner filter to extract the sequences if they don't occur with >>> strict periods. >>> >> >> Could you give an example? ?I've used a convolution to find the number >> of successive discrete events, but I'm not sure how to generalize it >> or if your suggestion is similar. 
>> >> For example, to count the number of three successes in a row for >> Bernoulli trials and to find where >> >> In [1]: import numpy as np >> >> In [2]: x = np.array([1,1,1,0,0,1,0,1,0,1,1,1,0,1,0,1]) >> >> In [3]: y = np.array([1,1,1]) >> >> In [4]: idx = np.convolve(x,y) >> >> In [5]: num_runs = len(np.where(idx==len(y))[0]) >> >> In [6]: # Extract runs from original array >> >> In [6]: idx = np.where(idx==len(y))[0]-(len(y)-1) >> >> In [7]: idx = np.hstack([np.arange(i,i+3) for i in idx]) >> >> In [8]: x[idx] >> Out[8]: array([1, 1, 1, 1, 1, 1]) > > It's different because in the original case you want to find the > periodicity, for example calculate the periodogram/fft and find the > spike. This should give the frequency and then the period length. If > there are small numerical problems, it would still narrow down the > range to search with direct comparison. > > Josef > (finding runs sounds like a fun problem, more than multiple tests and > comparisons) a one liner, just for fun and not relevant for the question >>> x = np.array([1,1,1,0,0,1,0,1,0,1,1,1,0,1,0,1]) >>> np.bincount(np.diff(np.nonzero(np.diff(np.r_[[-np.inf], x, [np.inf]]))[0])) array([0, 8, 1, 2]) >>> (_*np.arange(4)).sum() == len(x) True Josef > >> >> Skipper >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user >> > From gmane.comp.python.scientific.user at christian-schmuck.de Wed Dec 22 10:51:15 2010 From: gmane.comp.python.scientific.user at christian-schmuck.de (Christian Schmuck) Date: Wed, 22 Dec 2010 15:51:15 +0000 (UTC) Subject: [SciPy-User] speeding up integrate.odeint with weave/blitz References: Message-ID: Hey, I've been trying the 'for' macro myself. I couldn't get it working though. Here the very simple code example, I tried: #************************************************** from PyDSTool import Generator, args DSargs = args(name='ODEtest') DSargs.varspecs = {'y[i]': 'for(i, 0, 2, -y[i])'} testODE = Generator.Dopri_ODEsystem(DSargs) #************************************************** And here the error: ----------------------------------------------------------------------- KeyError Traceback (most recent call last) /home/schmuck/Arbeit/py_test/t4.py in () 5 DSargs.varspecs = {'y[i]': 'for(i, 0, 2, -y[i])'} 6 ----> 7 testODE = Generator.Dopri_ODEsystem(DSargs) 8 9 /usr/local/lib/python2.6/dist-packages/PyDSTool/Generator /Dopri_ODEsystem.pyc in __init__(self, kw) 331 else: 332 nobuild = False --> 333 ODEsystem.__init__(self, kw) 334 self.diagnostics.outputStatsInfo = { 335 'last_step': 'Predicted step size of the last accepted step (useful for a subsequent call to dop853).', /usr/local/lib/python2.6/dist-packages/PyDSTool/Generator /ODEsystem.py in __init__(self, kw) 73 self.variables[xname] = Variable(indepdomain=tdepdomain, 74 depdomain=Interval(xname, ---> 75 self.xtype[xname], 76 self.xdomain[xname], 77 self._abseps)) KeyError: 'y0' WARNING: Failure executing file: Has anyone ever used the for macro successfully and could he or she give a short, working example? I'm still a newbie with python and pyDSTool but I've done some debugging with winpdb and I've got the sneaky feeling that there is a bug. 
Thanks, Christian From f.pollastri at inrim.it Thu Dec 23 10:43:28 2010 From: f.pollastri at inrim.it (Fabrizio Pollastri) Date: Thu, 23 Dec 2010 15:43:28 +0000 (UTC) Subject: [SciPy-User] pandas data frame building by outer join Message-ID: Hello, A pandas question: it is possible to build a data frame from several time series, starting with an empty data frame and reading the time series one at a time from a file and joining them in outer mode to the data frame? How I can control the column name of each added time series? Here is a coding example, not working since join wants two data frames. from pandas import * from pandas.io.parsers import parseCSV import sys global_df = DataFrame() for fname in sys.argv[1:]: current_time_series = parseCSV(fname)['col_of_interest'] global_df = global_df.join(current_time_series,how='outer') TIA, Fabrizio Pollastri From wesmckinn at gmail.com Thu Dec 23 11:27:10 2010 From: wesmckinn at gmail.com (Wes McKinney) Date: Thu, 23 Dec 2010 11:27:10 -0500 Subject: [SciPy-User] pandas data frame building by outer join In-Reply-To: References: Message-ID: On Thu, Dec 23, 2010 at 10:43 AM, Fabrizio Pollastri wrote: > Hello, > > A pandas question: > it is possible to build a data frame from several time series, starting with an > empty data frame and reading the time series one at a time from a file and > joining them in outer mode to the data frame? How I can control the column name > of each added time series? > > Here is a coding example, not working since join wants two data frames. > > from pandas import * > from pandas.io.parsers import parseCSV > import sys > > global_df = DataFrame() > > for fname in sys.argv[1:]: > ? ?current_time_series = parseCSV(fname)['col_of_interest'] > ? ?global_df = global_df.join(current_time_series,how='outer') > > > TIA, > Fabrizio Pollastri > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > A preferable approach (faster and simpler) would be to create a dict of time series and pass that to the DataFrame constructor, e.g. data_dict = {} for fname in sys.argv[1:]: data_dict[fname] = parseCSV(fname)['col_of_interest'] df = DataFrame(data_dict) So the keys of the dict will be the column names. hth, Wes From dejan.org at gmail.com Thu Dec 23 13:59:41 2010 From: dejan.org at gmail.com (otrov) Date: Thu, 23 Dec 2010 19:59:41 +0100 Subject: [SciPy-User] Identify unique sequence data from array In-Reply-To: <800F4E15-FF35-43F6-A37E-427B67EBC259@gmail.com> References: <42523011.20101222184719@gmail.com> <225079401.20101222211803@gmail.com> <800F4E15-FF35-43F6-A37E-427B67EBC259@gmail.com> Message-ID: <165178387.20101223195941@gmail.com> > You could also check one element at a time. I think it will be > faster, because it will break if comparison of the first element > doesn't hold. Then, if you find such an occurrence, use Robert's > method to double check that you found the true repetition period. > Code: >>>> a = [1,2,3,4,1,2,3,4,1,2,3,4] >>>> a = numpy.array(a) >>>> for i in range(1, 1+a.size/2): > ... if (a[0] == a[::i]).all(): print 'period is ',i > ... > ... > period is 4 This works great! 
For 2D, if it's not obvious (this checks the first column only, so verify any hit with a full comparison):

for i in range(1, 1 + a.shape[0]/2):
    if (a[0,0] == a[::i,0]).all():
        print 'candidate period is ', i

Thank you

From Chris.Barker at noaa.gov Thu Dec 23 15:49:30 2010
From: Chris.Barker at noaa.gov (Christopher Barker)
Date: Thu, 23 Dec 2010 12:49:30 -0800
Subject: [SciPy-User] OT: calling Java from Python
Message-ID: <4D13B5DA.1020205@noaa.gov>

Hi folks,

This is a bit OT, but y'all tend to be a great resource for lots of stuff, and this is python-for-science related:

Are there any active projects supporting calling Java from CPython?

It seems JPE and JPype are both pretty dead, and I don't see much else except maybe Babel: https://computation.llnl.gov/casc/components/#page=home -- though that seems kind of like yet another language!

My use case:

Lately Unidata has put a lot more emphasis on the JAVA implementation of the netcdf libraries than on the C one, so it has a lot of nifty features not supported for C, and therefore for Python.

There is this project for calling the netcdf JAVA libs from MATLAB: http://sourceforge.net/apps/trac/njtbx

Which made me think that it would be nice to have something similar for Python. With the Python C API and JNI (and maybe Cython), it should be possible, but I sure don't want to start doing that from scratch.

gnu CNI looks promising, too.

There is a lot going on in the JAVA world; it would be nice to access some of that.

Are any of you doing anything like this? Any thoughts?

-Chris

--
Christopher Barker, Ph.D.
Oceanographer
Emergency Response Division
NOAA/NOS/OR&R (206) 526-6959 voice
7600 Sand Point Way NE (206) 526-6329 fax
Seattle, WA 98115 (206) 526-6317 main reception
Chris.Barker at noaa.gov

From cgohlke at uci.edu Thu Dec 23 16:00:14 2010
From: cgohlke at uci.edu (Christoph Gohlke)
Date: Thu, 23 Dec 2010 13:00:14 -0800
Subject: [SciPy-User] OT: calling Java from Python
In-Reply-To: <4D13B5DA.1020205@noaa.gov>
References: <4D13B5DA.1020205@noaa.gov>
Message-ID: <4D13B85E.7090906@uci.edu>

On 12/23/2010 12:49 PM, Christopher Barker wrote:
> [...]

CellProfiler calls Java libraries (bioformats, ImageJ) via JNI. Take a look at .
--
Christoph

From josef.pktd at gmail.com Thu Dec 23 17:00:49 2010
From: josef.pktd at gmail.com (josef.pktd at gmail.com)
Date: Thu, 23 Dec 2010 17:00:49 -0500
Subject: [SciPy-User] OT: calling Java from Python
In-Reply-To: <4D13B85E.7090906@uci.edu>
References: <4D13B5DA.1020205@noaa.gov> <4D13B85E.7090906@uci.edu>
Message-ID:

On Thu, Dec 23, 2010 at 4:00 PM, Christoph Gohlke wrote:
> [...]

http://pypi.python.org/pypi/JCC/2.7

Josef

> --
> Christoph
> _______________________________________________
> SciPy-User mailing list
> SciPy-User at scipy.org
> http://mail.scipy.org/mailman/listinfo/scipy-user

From josef.pktd at gmail.com Fri Dec 24 17:19:21 2010
From: josef.pktd at gmail.com (josef.pktd at gmail.com)
Date: Fri, 24 Dec 2010 17:19:21 -0500
Subject: [SciPy-User] runstest and distribution of run lengths
Message-ID:

Does anyone know the distribution of run lengths in a sequence of Bernoulli trials?

I thought I could implement a runs test as a quick exercise, but I got (kind of) stuck.

I implemented the Wald-Wolfowitz runs test (plus one- and two-sample versions) according to Wikipedia http://en.wikipedia.org/wiki/Wald%E2%80%93Wolfowitz_runs_test and the SAS manual. This test only looks at the total number of runs, and the SAS manual has both the exact distribution for small samples and the normal approximation for large samples. So, this went ok.
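For reference, the normal approximation just described can be written out directly. A minimal sketch, not from the thread itself; the moments are the standard ones from the Wikipedia page cited above, and the test data are made up:

import numpy as np
from scipy import stats

def runs_test(x):
    # Wald-Wolfowitz runs test for a 0/1 sequence, normal approximation
    x = np.asarray(x)
    n1 = (x == 1).sum()
    n0 = (x == 0).sum()
    n = n1 + n0
    r = 1 + (np.diff(x) != 0).sum()       # total number of runs
    mu = 2.0 * n1 * n0 / n + 1            # expected runs under randomness
    sigma2 = (mu - 1) * (mu - 2) / (n - 1.0)
    z = (r - mu) / np.sqrt(sigma2)
    return z, 2 * stats.norm.sf(abs(z))   # two-sided p-value

np.random.seed(1234)
x = (np.random.rand(200) > 0.5).astype(int)
print runs_test(x)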
But the runstest in the NIST manual and in dataplot, has the entire distribution of run lengths http://www.itl.nist.gov/div898/handbook/eda/section3/eda35d.htm They mention a book, Bradley, 1968, that I don't have, but they don't say what the formulas and distribution for the expected values and standard deviation that they use are. Does anyone have an idea or knows a more easily accessible reference? Josef From mondifero at gmail.com Sat Dec 25 16:53:01 2010 From: mondifero at gmail.com (O) Date: Sat, 25 Dec 2010 19:53:01 -0200 Subject: [SciPy-User] Decorrelation stretch and PCA, Central File Exchange Message-ID: Dear SciPy users, Can you recommend a PCA module for Python? I have multi-variable data and want to find the principal components... I'm also searching for an image decorrelation stretch capability for Python, but I can write that rather quickly myself if I can find or write PCA. (Just the sort of think I would post on a SciPy Central File Exchange wink wink nudge nudge ;-) Merry grav-mass, O -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Sat Dec 25 18:19:27 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Sat, 25 Dec 2010 18:19:27 -0500 Subject: [SciPy-User] Decorrelation stretch and PCA, Central File Exchange In-Reply-To: References: Message-ID: On Sat, Dec 25, 2010 at 4:53 PM, O wrote: > > Dear SciPy users, > > Can you recommend a PCA module for Python?? I have multi-variable data and > want to find the principal components... > > I'm also searching for an image decorrelation stretch capability for Python, > but I can write that rather quickly myself if I can find or write PCA. at least matplotlib, mdp, scikits.learn and scikits.statsmodels each have a pca, and there are several versions posted to the mailing list. scipy-user, May 19, "PCA functions" might be a good starting point, maybe Zachary's version (from a quick search in my cookbook, the mailing list ) Josef > > (Just the sort of think I would post on a SciPy Central File Exchange wink > wink nudge nudge? ;-) > > Merry grav-mass, > > O > > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > From ryotat at gmx.de Sat Dec 25 21:51:46 2010 From: ryotat at gmx.de (Ryota Tomioka) Date: Sun, 26 Dec 2010 03:51:46 +0100 Subject: [SciPy-User] scipy.test() causes segmentation fault for test_lobpcg Message-ID: <20101226025146.315430@gmx.net> Dear Scipy users, I have recently installed numpy 1.5.1rc1 and scipy 0.8.0 on a CentOS 5.5 server. ATLAS was compiled with gfortran and I also specified gfortran for the installation of both numpy and scipy. numpy.test() ran without trouble, but scipy.test() crashed due to segmentation fault and this was in test_lobpcg.test_ElasticRod. In order to reproduce the result I copied /usr/local/lib/python2.6/site-packages/scipy/sparse/linalg/eigen/lobpcg/tests/test_lobpcg.py to my home directory and did the following. [ryotat at cyprus ~]$ python Python 2.6.6 (r266:84292, Nov 19 2010, 22:23:00) [GCC 4.1.2 20080704 (Red Hat 4.1.2-48)] on linux2 Type "help", "copyright", "credits" or "license" for more information. 
From ryotat at gmx.de Sat Dec 25 21:51:46 2010
From: ryotat at gmx.de (Ryota Tomioka)
Date: Sun, 26 Dec 2010 03:51:46 +0100
Subject: [SciPy-User] scipy.test() causes segmentation fault for test_lobpcg
Message-ID: <20101226025146.315430@gmx.net>

Dear Scipy users,

I have recently installed numpy 1.5.1rc1 and scipy 0.8.0 on a CentOS 5.5 server. ATLAS was compiled with gfortran and I also specified gfortran for the installation of both numpy and scipy. numpy.test() ran without trouble, but scipy.test() crashed due to a segmentation fault, and this was in test_lobpcg.test_ElasticRod.

In order to reproduce the result I copied /usr/local/lib/python2.6/site-packages/scipy/sparse/linalg/eigen/lobpcg/tests/test_lobpcg.py to my home directory and did the following.

[ryotat at cyprus ~]$ python
Python 2.6.6 (r266:84292, Nov 19 2010, 22:23:00)
[GCC 4.1.2 20080704 (Red Hat 4.1.2-48)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> from test_lobpcg import *
>>> A,B = ElasticRod(100)
>>> compare_solutions(A,B,100)
>>> compare_solutions(A,B,80)
>>> compare_solutions(A,B,40)
>>> compare_solutions(A,B,30)
>>> compare_solutions(A,B,22)
>>> compare_solutions(A,B,21)
>>> compare_solutions(A,B,20)
Segmentation fault

So it seems to happen only around m=20. m=10 did not cause a segmentation fault but resulted in

AssertionError: Arrays are not almost equal

To see it in more detail, I tried

>>> A,B = ElasticRod(100)
>>> m = 20
>>> n = A.shape[0]
>>> numpy.random.seed(0)
>>> V = rand(n,m)
>>> X = linalg.orth(V)
>>> eigs,vecs = lobpcg(A, X, B=B, tol=1e-5, maxiter=30, verbosityLevel=10)
Solving generalized eigenvalue problem with preconditioning

matrix size 100
block size 20

No constraints

iteration 0
[ True True True True True True True True True True True True True
  True True True True True True True]
current block size: 20
eigenvalue: [ 1.785e+12  1.586e+12  1.356e+12  1.330e+12  1.212e+12  1.155e+12
  1.080e+12  9.149e+11  8.272e+11  8.229e+11  7.664e+11  6.941e+11
  6.769e+11  5.848e+11  5.553e+11  4.994e+11  4.283e+11  3.813e+11
  3.537e+11  1.058e+10]
residual norms: [ 7.223e+10  6.780e+10  7.145e+10  7.305e+10  6.290e+10
  7.085e+10  6.539e+10  5.466e+10  6.137e+10  5.374e+10  5.809e+10
  5.725e+10  5.375e+10  5.334e+10  5.052e+10  4.746e+10  4.176e+10
  3.650e+10  3.283e+10  6.905e+09]
Segmentation fault

Has anyone experienced something similar? Or could anyone suggest where I should look?

Thanks,
Ryota

From kwgoodman at gmail.com Mon Dec 27 15:04:04 2010
From: kwgoodman at gmail.com (Keith Goodman)
Date: Mon, 27 Dec 2010 12:04:04 -0800
Subject: [SciPy-User] [ANN] Bottleneck 0.2
Message-ID:

Bottleneck is a collection of fast NumPy array functions written in Cython.

The second release of Bottleneck is faster, contains more functions, and supports more dtypes.

Faster:
- All functions faster (less overhead) when output is not a scalar
- Faster nanmean() for 2d, 3d arrays containing NaNs when axis is not None

New functions:
- nanargmin()
- nanargmax()
- nanmedian, 100X faster than SciPy's nanmedian for (100,100) input, axis=0

Enhancements:
- Added support for float32
- Fallback to slower, non-Cython functions for unaccelerated ndim/dtype
- Scipy is no longer a dependency
- Added support for older versions of NumPy (1.4.1)
- All functions are now templated for dtype and axis
- Added a sandbox for prototyping of new Bottleneck functions
- Rewrote benchmarking code

Breaks from 0.1.0:
- To run the benchmark use bn.bench() instead of bn.benchit()

download
    http://pypi.python.org/pypi/Bottleneck
docs
    http://berkeleyanalytics.com/bottleneck
code
    http://github.com/kwgoodman/bottleneck
mailing list
    http://groups.google.com/group/bottle-neck
mailing list 2
    http://mail.scipy.org/mailman/listinfo/scipy-user

Bottleneck comes with a benchmark suite that compares the performance of the bottleneck functions that have a NumPy/SciPy equivalent.
To run the benchmark:

>>> bn.bench(mode='fast')
Bottleneck performance benchmark
    Bottleneck 0.2.0
    Numpy (np) 1.5.1
    Scipy (sp) 0.8.0
    Speed is NumPy or SciPy time divided by Bottleneck time
    NaN means one-third NaNs; axis=0 and float64 are used
median vs np.median
    3.59  (10,10)
    2.43  (1001,1001)
    2.28  (1000,1000)
    2.16  (100,100)
nanmedian vs local copy of sp.stats.nanmedian
    102.72  (10,10)      NaN
    94.34   (10,10)
    67.89   (100,100)    NaN
    28.52   (100,100)
    6.37    (1000,1000)  NaN
    4.41    (1000,1000)
nanmax vs np.nanmax
    9.99  (100,100)    NaN
    6.12  (10,10)      NaN
    5.99  (10,10)
    5.88  (100,100)
    1.79  (1000,1000)  NaN
    1.76  (1000,1000)
nanmean vs local copy of sp.stats.nanmean
    25.95  (100,100)    NaN
    12.85  (100,100)
    12.26  (10,10)      NaN
    11.89  (10,10)
    5.15   (1000,1000)  NaN
    3.17   (1000,1000)
nanstd vs local copy of sp.stats.nanstd
    16.96  (100,100)    NaN
    15.75  (10,10)      NaN
    15.49  (10,10)
    9.51   (100,100)
    3.85   (1000,1000)  NaN
    2.82   (1000,1000)
nanargmax vs np.nanargmax
    8.60  (100,100)    NaN
    5.65  (10,10)      NaN
    5.62  (100,100)
    5.44  (10,10)
    2.84  (1000,1000)  NaN
    2.58  (1000,1000)
move_nanmean vs sp.ndimage.convolve1d based function
    window = 5
    19.52  (10,10)      NaN
    18.55  (10,10)
    10.56  (100,100)    NaN
    6.67   (100,100)
    5.19   (1000,1000)  NaN
    4.42   (1000,1000)

Under the hood Bottleneck uses a separate Cython function for each combination of ndim, dtype, and axis. A lot of the overhead in bn.nanmax(), for example, is in checking that the axis is within range, converting non-array data to an array, and selecting the function to use to calculate the maximum. You can get rid of the overhead by calling the underlying Cython function directly.

Benchmarks for the low-level Cython version of each function:

>>> bn.bench(mode='faster')
Bottleneck performance benchmark
    Bottleneck 0.2.0
    Numpy (np) 1.5.1
    Scipy (sp) 0.8.0
    Speed is NumPy or SciPy time divided by Bottleneck time
    NaN means one-third NaNs; axis=0 and float64 are used
median_selector vs np.median
    15.29  (10,10)
    14.19  (100,100)
    8.04   (1001,1001)
    7.32   (1000,1000)
nanmedian_selector vs local copy of sp.stats.nanmedian
    352.08  (10,10)      NaN
    340.27  (10,10)
    185.56  (100,100)    NaN
    138.81  (100,100)
    8.21    (1000,1000)
    8.09    (1000,1000)  NaN
nanmax_selector vs np.nanmax
    21.54  (10,10)      NaN
    19.98  (10,10)
    12.65  (100,100)    NaN
    6.82   (100,100)
    1.79   (1000,1000)  NaN
    1.76   (1000,1000)
nanmean_selector vs local copy of sp.stats.nanmean
    41.08  (10,10)      NaN
    39.05  (10,10)
    31.74  (100,100)    NaN
    15.24  (100,100)
    5.13   (1000,1000)  NaN
    3.16   (1000,1000)
nanstd_selector vs local copy of sp.stats.nanstd
    44.55  (10,10)      NaN
    43.49  (10,10)
    18.66  (100,100)    NaN
    10.29  (100,100)
    3.83   (1000,1000)  NaN
    2.82   (1000,1000)
nanargmax_selector vs np.nanargmax
    17.91  (10,10)      NaN
    17.00  (10,10)
    10.56  (100,100)    NaN
    6.50   (100,100)
    2.85   (1000,1000)  NaN
    2.59   (1000,1000)
move_nanmean_selector vs sp.ndimage.convolve1d based function
    window = 5
    55.96  (10,10)      NaN
    50.82  (10,10)
    11.77  (100,100)    NaN
    6.93   (100,100)
    5.56   (1000,1000)  NaN
    4.51   (1000,1000)

From dagss at student.matnat.uio.no Tue Dec 28 08:42:21 2010
From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn)
Date: Tue, 28 Dec 2010 14:42:21 +0100
Subject: [SciPy-User] [ANN] Bottleneck 0.2
In-Reply-To:
References:
Message-ID: <4D19E93D.2090903@student.matnat.uio.no>

On 12/27/2010 09:04 PM, Keith Goodman wrote:
> Bottleneck is a collection of fast NumPy array functions written in Cython.
>
> The second release of Bottleneck is faster, contains more functions,
> and supports more dtypes.
>
Another special case for you if you want: It seems that you could add the case of "mode='c'" to the array declarations, in the case that the operation goes along the last axis and arr.flags.c_contiguous == True.

Dag Sverre

> [...]
From kwgoodman at gmail.com Tue Dec 28 11:57:07 2010
From: kwgoodman at gmail.com (Keith Goodman)
Date: Tue, 28 Dec 2010 08:57:07 -0800
Subject: [SciPy-User] [ANN] Bottleneck 0.2
In-Reply-To: <4D19E93D.2090903@student.matnat.uio.no>
References: <4D19E93D.2090903@student.matnat.uio.no>
Message-ID:

On Tue, Dec 28, 2010 at 5:42 AM, Dag Sverre Seljebotn wrote:
> Another special case for you if you want: It seems that you could add
> the case of "mode='c'" to the array declarations, in the case that the
> operation goes along the last axis and arr.flags.c_contiguous == True.

Wow! That works great for large input arrays:

>> a = np.random.rand(1000,1000)
>> timeit bn.func.nanmean_2d_float64_axis1(a)
1000 loops, best of 3: 1.52 ms per loop
>> timeit a.flags.c_contiguous == True; bn.func.nanmean_2d_float64_ccontiguous_axis1(a)
1000 loops, best of 3: 1.18 ms per loop

And for medium arrays:

>> a = np.random.rand(100,100)
>> timeit bn.func.nanmean_2d_float64_axis1(a)
100000 loops, best of 3: 16.3 us per loop
>> timeit a.flags.c_contiguous == True; bn.func.nanmean_2d_float64_ccontiguous_axis1(a)
100000 loops, best of 3: 13.3 us per loop

But the overhead of checking for c contiguous slows things down for small arrays:

>> a = np.random.rand(10,10)
>> timeit bn.func.nanmean_2d_float64_axis1(a)
1000000 loops, best of 3: 1.28 us per loop
>> timeit a.flags.c_contiguous == True; bn.func.nanmean_2d_float64_ccontiguous_axis1(a)
1000000 loops, best of 3: 1.55 us per loop
>> timeit a.flags.c_contiguous == True
1000000 loops, best of 3: 201 ns per loop
>> timeit a.flags.c_contiguous
10000000 loops, best of 3: 158 ns per loop

Plus I'd have to check if the axis is the last one.
That's a big speed up for hand coded functions and large input arrays. But I'm not sure how to take advantage of it for general use functions. One option is to provide the low level functions (like nanmean_2d_float64_ccontiguous_axis1) but not use them in the high-level function nanmean.

I tried using mode='c' when initializing the output array. But I did not see any speed difference, perhaps because the size of the output array is the square root of the input array size. So I tried it with a non-reducing function: move_nanmean. But I didn't see any speed difference. No idea why.

From kwgoodman at gmail.com Tue Dec 28 12:15:18 2010
From: kwgoodman at gmail.com (Keith Goodman)
Date: Tue, 28 Dec 2010 09:15:18 -0800
Subject: [SciPy-User] [ANN] Bottleneck 0.2
In-Reply-To:
References: <4D19E93D.2090903@student.matnat.uio.no>
Message-ID:

On Tue, Dec 28, 2010 at 8:57 AM, Keith Goodman wrote:
> [...]
> I tried using mode='c' when initializing the output array. But I did
> not see any speed difference, perhaps because the size of the output
> array is the square root of the input array size. So I tried it with a
> non-reducing function: move_nanmean. But I didn't see any speed
> difference. No idea why.

Oh, I don't see a speed difference when I use mode='c' on the input array to move_nanmean. Could it be because the function is constantly switching at each step along the last axis between indexing into the input array and indexing into the output array, and in that case contiguous memory doesn't help?
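For the record, the kind of dispatch the high-level functions would need looks roughly like this (a sketch; only the 2d float64 ccontiguous kernel from the timings above exists, the rest of the dispatch is hypothetical):

import numpy as np
import bottleneck as bn

def nanmean_fast(arr, axis):
    # use the contiguous kernel only when the reduction runs along
    # the last axis of a C-contiguous 2d float64 array
    if (arr.ndim == 2 and arr.dtype == np.float64 and
            axis == 1 and arr.flags.c_contiguous):
        return bn.func.nanmean_2d_float64_ccontiguous_axis1(arr)
    # otherwise fall back to the regular high-level function
    return bn.nanmean(arr, axis=axis)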
From seb.haase at gmail.com Tue Dec 28 17:40:16 2010
From: seb.haase at gmail.com (Sebastian Haase)
Date: Tue, 28 Dec 2010 23:40:16 +0100
Subject: [SciPy-User] [ANN] Bottleneck 0.2
In-Reply-To:
References: <4D19E93D.2090903@student.matnat.uio.no>
Message-ID:

Congratulations! What do you mean by "templated functions" -- do you have a way of doing Cython template functions now?

- Sebastian

On Tue, Dec 28, 2010 at 6:15 PM, Keith Goodman wrote:
> [...]
From kwgoodman at gmail.com Tue Dec 28 17:56:27 2010
From: kwgoodman at gmail.com (Keith Goodman)
Date: Tue, 28 Dec 2010 14:56:27 -0800
Subject: [SciPy-User] [ANN] Bottleneck 0.2
In-Reply-To:
References: <4D19E93D.2090903@student.matnat.uio.no>
Message-ID:

On Tue, Dec 28, 2010 at 2:40 PM, Sebastian Haase wrote:
> Congratulations! What do you mean by "templated functions" -- do you
> have a way of doing Cython template functions now?

Thank you! My hope is that others find it useful.

Nothing fancy (or of general use). For example, the nanmax function template:

https://github.com/kwgoodman/bottleneck/blob/master/bottleneck/src/template/func/nanmax.py

is used to generate the nanmax pyx file:

https://github.com/kwgoodman/bottleneck/blob/master/bottleneck/src/func/nanmax.pyx

The templating of the axis, for example, is done like this (from the looper function docstring):

Make a 3d loop template:

>>> loop = '''
.... for iINDEX0 in range(nINDEX0):
....     for iINDEX1 in range(nINDEX1):
....         amin = MAXDTYPE
....         for iINDEX2 in range(nINDEX2):
....             ai = a[INDEXALL]
....             if ai <= amin:
....                 amin = ai
....         y[INDEXPOP] = amin
.... '''

Import the looper function:

>>> from bottleneck.src.template.template import looper

Make a loop over axis=0:

>>> print looper(loop, ndim=3, axis=0)
for i1 in range(n1):
    for i2 in range(n2):
        amin = MAXDTYPE
        for i0 in range(n0):
            ai = a[i0, i1, i2]
            if ai <= amin:
                amin = ai
        y[i1, i2] = amin

Make a loop over axis=1:

>>> print looper(loop, ndim=3, axis=1)
for i0 in range(n0):
    for i2 in range(n2):
        amin = MAXDTYPE
        for i1 in range(n1):
            ai = a[i0, i1, i2]
            if ai <= amin:
                amin = ai
        y[i0, i2] = amin

From jsseabold at gmail.com Wed Dec 29 09:43:05 2010
From: jsseabold at gmail.com (Skipper Seabold)
Date: Wed, 29 Dec 2010 09:43:05 -0500
Subject: [SciPy-User] runstest and distribution of run lengths
In-Reply-To:
References:
Message-ID:

On Fri, Dec 24, 2010 at 5:19 PM, wrote:
> Does anyone know the distribution of run lengths in a sequence of
> Bernoulli trials?
> [...]
> Does anyone have an idea or know a more easily accessible reference?

Only other one I've come across (via Stata):

Swed, F.S. and Eisenhart, C. 1943. "Tables for testing randomness of grouping in a sequence of alternatives." The Annals of Mathematical Statistics 14(1), 66-87.
http://scholar.google.com/scholar?cluster=3689844222480893877&hl=en&as_sdt=20000

Skipper
From yury at shurup.com Thu Dec 30 12:47:32 2010
From: yury at shurup.com (Yury V. Zaytsev)
Date: Thu, 30 Dec 2010 18:47:32 +0100
Subject: [SciPy-User] Sometimes fmin_l_bfgs_b tests NaN parameters and then fails to converge
Message-ID: <1293731252.6936.47.camel@mypride>

Dear Scipy experts,

I am implementing a proof of concept code for a bounded optimization problem using the Scipy optimization module, and particularly the fmin_l_bfgs_b bounded optimizer.

The problem is that my code runs fine on one machine, but on another one at some point the optimizer passes NaNs as parameter values and then after some time fails to converge normally (it terminates, but with an ABNORMAL_TERMINATION_IN_LNSRCH error message).

I am very confused about that and puzzled as to what extent I can trust the results if they are so system dependent. I would appreciate it if the developers could tell me whether this problem is reproducible and what the reason for its occurrence is. If I can provide any additional information which would be helpful in diagnosing the issue, please bear with me.

An extended description of my setup and machines and some test code to reproduce the problem are presented below:

---

I have two machines:

1) Development machine: Ubuntu Hardy / 32 bit
Linux mypride 2.6.24-28-generic #1 SMP Wed Nov 24 09:30:14 UTC 2010 i686 GNU/Linux

2) Test machine: Ubuntu Lucid / 64 bit
Linux davis 2.6.32-27-generic #49-Ubuntu SMP Thu Dec 2 00:51:09 UTC 2010 x86_64 GNU/Linux

The versions of the Python and Numpy/Scipy stack on both machines are identical.

I run ActiveState Python 2.7.1 installed in a virtualenv, where the latest versions of Numpy and Scipy are installed (using pip install numpy / pip install scipy; scipy needs a patch that I fished out of the svn in order to compile). The tests pass on both machines, with an exception for the nakagami distribution on (1) that I don't care about. The pre-requisites were automatically installed as follows:

$ sudo apt-get build-dep python-numpy python-scipy

The patch, test logs, test script and optimization logs are attached.

Please note that on machine (1) it does 5 extra steps in between, trying out NaN parameters for some reason. In this case the optimization converges, but that is not always true for my bigger problem. However, you can already see the problem of the extra iterations taking time, which can lead to trouble when the number of parameters is 100+.

I have literally compared the numbers that come out of my bigger optimization and they are slightly different, although the difference is in the 10th significant digit or so, which is why I didn't attribute any special meaning to it.

The test script actually implements the simplest function I could think of (it gives inf if x <= 0 and x + 1 if x > 0, so basically the optimizer has to get as close to zero as it can).
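In case the attachments get stripped, the test script is essentially this (a reconstruction; the starting point and bounds here are representative guesses, see the attached optimizer_test.py for the exact values):

import numpy as np
from scipy.optimize import fmin_l_bfgs_b

def f(x):
    # inf for x <= 0, x + 1 otherwise, so the optimizer
    # has to get as close to zero as it can
    if x[0] <= 0:
        return np.inf
    return x[0] + 1.0

x, fval, info = fmin_l_bfgs_b(f, np.array([1.0]), approx_grad=True,
                              bounds=[(-1.0, 1.0)])
print x, fval, info['warnflag']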
One obvious difference between the machines is the bitness, but also, the environment on the 32-bit machine is older, so this could be a compiler or library problem. That's why I am seeking confirmation from other users of Scipy, possibly running on completely different sets of libraries and compilers.

Thanks!

--
Sincerely yours,
Yury V. Zaytsev

-------------- next part --------------
A non-text attachment was scrubbed...
Name: scipy-0.8.0-python-2.7.patch
Type: text/x-patch
Size: 1214 bytes
-------------- next part --------------
A non-text attachment was scrubbed...
Name: machine-1-opt.log
Type: text/x-log
Size: 1199 bytes
-------------- next part --------------
A non-text attachment was scrubbed...
Name: machine-2-opt.log
Type: text/x-log
Size: 1034 bytes
-------------- next part --------------
A non-text attachment was scrubbed...
Name: optimizer_test.py
Type: text/x-python
Size: 328 bytes
-------------- next part --------------
A non-text attachment was scrubbed...
Name: machine-1-tests.log
Type: text/x-log
Size: 1744 bytes
-------------- next part --------------
A non-text attachment was scrubbed...
Name: machine-2-tests.log
Type: text/x-log
Size: 1013 bytes

From polish at dtgroup.com Thu Dec 30 14:15:09 2010
From: polish at dtgroup.com (Nathaniel Polish)
Date: Thu, 30 Dec 2010 14:15:09 -0500
Subject: [SciPy-User] setup question
Message-ID: <423192D815B879F4FE0658C5@MORNINGSIDE>

I am about to take the plunge into scipy/numpy/python. It seems that Python has recently split into 2.x and 3.x versions. Should I be going down the Python 3.x path or the 2.x path if I am primarily using it for scipy? Are the 2.x/3.x Python issues relevant to scipy?

I will be running on Ubuntu -- any caveats there?

Thanks.

From yury at shurup.com Thu Dec 30 14:56:28 2010
From: yury at shurup.com (Yury V. Zaytsev)
Date: Thu, 30 Dec 2010 20:56:28 +0100
Subject: [SciPy-User] setup question
In-Reply-To: <423192D815B879F4FE0658C5@MORNINGSIDE>
References: <423192D815B879F4FE0658C5@MORNINGSIDE>
Message-ID: <1293738988.6936.53.camel@mypride>

On Thu, 2010-12-30 at 14:15 -0500, Nathaniel Polish wrote:
> I am about to take the plunge into scipy/numpy/python. It seems that
> Python has recently split into 2.x and 3.x versions. Should I be going
> down the Python 3.x path or the 2.x path if I am primarily using it for scipy?
> Are the 2.x/3.x Python issues relevant to scipy?

If you are asking such a question, it means that you need Python 2.x.

To make a long story short, the libraries' transition to py3k is far from complete, and for now, if you don't really need py3k features as a matter of life and death, you'd better start with 2.x and migrate to py3k later on, when it, and you, are ready.

> I will be running on Ubuntu -- any caveats there?

Python 2.x, Numpy, Scipy and a whole lot of other things come pre-packaged with Ubuntu, so you will only need to install them using Synaptic. Not the latest versions, obviously, but you don't need the latest and greatest if you're just taking off. You will see when you need more recent versions as you gain more experience.

--
Sincerely yours,
Yury V. Zaytsev

From rob.clewley at gmail.com Thu Dec 30 19:26:35 2010
From: rob.clewley at gmail.com (Rob Clewley)
Date: Thu, 30 Dec 2010 19:26:35 -0500
Subject: [SciPy-User] speeding up integrate.odeint with weave/blitz
In-Reply-To:
References:
Message-ID:

On Wed, Dec 22, 2010 at 10:51 AM, Christian Schmuck wrote:
> Has anyone ever used the for macro successfully and could he or she
> give a short, working example? I'm still a newbie with python and
> pyDSTool but I've done some debugging with winpdb and I've got
> the sneaky feeling that there is a bug.

Thanks for pointing this out. This was a bug and it's now fixed on sourceforge.

-Rob

From pgarrone at optusnet.com.au Fri Dec 31 05:00:42 2010
From: pgarrone at optusnet.com.au (Peter John Garrone)
Date: Fri, 31 Dec 2010 21:00:42 +1100
Subject: [SciPy-User] VODE seems too slow for big systems.
Message-ID: <20101231100042.GA2828@bacchus>

Hi,

I am modelling water flow and erosion using a triangular mesh. (I am attempting to develop realistic terrain for a game scenario.)
I have developed a model that integrates an ODE using VODE from scipy.integrate, which seems to work the best. Using 4 states per point, the system works for models at a certain scale. I am able to differentiate the model and calculate the banded Jacobian accurately, which helps a lot.

To make it work faster and better, and looking to put it on Amazon to calculate models of finer resolution, I made my function evaluations threaded. However, for large models, using no threads, one thread, or two threads made little difference to the calculation rate. Indeed, when running threaded, the two threads that did the function evaluations, and were supposed to be occupying my dual-CPU system, were instead using only a small fraction of the CPU, and I infer that most of the time was lost in the VODE algorithm. The measurement was made with the ps utility on Linux.

Looking at the BDF algorithm that VODE employs, I would guess that it intrinsically scales linearly. However, as it uses a predictor-corrector step that employs Newton's method, which solves a linear system involving the Jacobian, I speculate that solving that system by direct factorization would be the element taking most of the time for models with tens of thousands of states. There are iterative solvers in scipy that might solve it much more quickly, but getting VODE to employ them might be a problem.

I wonder if anybody could point me to a better approach here.

Peter Garrone
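For context, my integration loop is essentially of this shape (a much simplified sketch: the real rhs and jac couple 4 states per mesh point, the bandwidths are larger, and the exact banded-storage convention expected by 'vode' should be checked against the docs):

import numpy as np
from scipy.integrate import ode

def rhs(t, y):
    # placeholder right-hand side; the real model couples
    # the states of neighbouring mesh points
    return -y

def jac(t, y):
    # user-supplied banded Jacobian, shape (lband + uband + 1, n);
    # here a trivial diagonal-only stand-in
    return -np.ones((1, y.size))

n = 40000  # e.g. 4 states per point on a 10000-point mesh
r = ode(rhs, jac)
r.set_integrator('vode', method='bdf', lband=0, uband=0, nsteps=5000)
r.set_initial_value(np.ones(n), 0.0)
while r.successful() and r.t < 1.0:
    r.integrate(1.0, step=True)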
From yury at shurup.com Fri Dec 31 13:41:02 2010
From: yury at shurup.com (Yury V. Zaytsev)
Date: Fri, 31 Dec 2010 19:41:02 +0100
Subject: [SciPy-User] setup question
In-Reply-To: <6C89A1D2DF0843937D90ACAE@[192.168.1.112]>
References: <6C89A1D2DF0843937D90ACAE@[192.168.1.112]>
Message-ID: <1293820862.6756.35.camel@mypride>

Hi!

On Fri, 2010-12-31 at 13:31 -0500, Nathaniel Polish wrote:
> As a computer scientist my urge is for the latest version but it seems that
> there are some big issues in the 2.x to 3.x path.

Numpy / Scipy are just not fully py3k-ready yet. Pretty close, but not yet. The situation with other libraries is similar. So if you are (1) just starting (2) right now (and not in a year or two), go for what is available in the Ubuntu repositories at the moment.

I personally use interpreters from ActiveState and custom builds of Numpy / Scipy in a virtualenv, but the reason for this is that I need to keep the environment consistent among the cluster, desktop and personal computers, and also that the bugs I'm hitting were only fixed recently. You'd better get to it when you realize that you need it yourself.

P.S. Please keep the conversation on list.

P.P.S. Happy New Year to all the readers!

--
Sincerely yours,
Yury V. Zaytsev

From josef.pktd at gmail.com Fri Dec 31 16:35:01 2010
From: josef.pktd at gmail.com (josef.pktd at gmail.com)
Date: Fri, 31 Dec 2010 16:35:01 -0500
Subject: [SciPy-User] Sometimes fmin_l_bfgs_b tests NaN parameters and then fails to converge
In-Reply-To: <1293731252.6936.47.camel@mypride>
References: <1293731252.6936.47.camel@mypride>
Message-ID:

2010/12/30 Yury V. Zaytsev :
> Dear Scipy experts,
>
> I am implementing a proof of concept code for a bounded optimization
> problem using the Scipy optimization module, and particularly the
> fmin_l_bfgs_b bounded optimizer.
> [...]
> The test script actually implements the simplest function I could think
> of (it gives inf if x <= 0 and x + 1 if x > 0, so basically the
> optimizer has to get as close to zero as it can).

I get, on Windows 32:

(array([ 3.03873549e-08]), array([ 1.00000003]), {'warnflag': 0, 'task': 'CONVERGENCE: REL_REDUCTION_OF_F <= FACTR*EPSMCH', 'grad': array([ 0.99999999]), 'funcalls': 16})

But your function has a discontinuity, and I wouldn't expect a bfgs method to produce anything useful, since the method assumes smoothness, as far as I know.

Josef
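(For comparison, on a smooth bounded problem I would expect no NaN probing at all; an untested sketch with an arbitrary quadratic:)

import numpy as np
from scipy.optimize import fmin_l_bfgs_b

def f(x):
    return (x[0] - 2.0)**2 + 1.0   # smooth, minimum at x = 2

x, fval, info = fmin_l_bfgs_b(f, np.array([10.0]), approx_grad=True,
                              bounds=[(0.0, 5.0)])
print x, fval, info['warnflag']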
From yury at shurup.com Fri Dec 31 19:39:35 2010
From: yury at shurup.com (Yury V. Zaytsev)
Date: Sat, 01 Jan 2011 01:39:35 +0100
Subject: [SciPy-User] Sometimes fmin_l_bfgs_b tests NaN parameters and then fails to converge
In-Reply-To:
References: <1293731252.6936.47.camel@mypride>
Message-ID: <1293842375.17929.3.camel@mypride>

On Fri, 2010-12-31 at 16:35 -0500, josef.pktd at gmail.com wrote:
> But your function has a discontinuity, and I wouldn't expect a bfgs
> method to produce anything useful, since the method assumes smoothness,
> as far as I know.

You are perfectly right about the discontinuity, but that was not the point. I was rather interested in whether anyone else is seeing the optimizer trying out NaNs as function parameters, as in my case... I have this problem with a completely different (smooth and differentiable) function; the test script is just something I came up with, without thinking too much, to illustrate the problem.

Happy New Year!

--
Sincerely yours,
Yury V. Zaytsev

From bioinformed at gmail.com Thu Dec 23 23:34:07 2010
From: bioinformed at gmail.com (Kevin Jacobs)
Date: Thu, 23 Dec 2010 23:34:07 -0500
Subject: [SciPy-User] [Numpy-discussion] ANN: carray 0.3 released
In-Reply-To: <201012221958.41105.faltet@pytables.org>
References: <201012221958.41105.faltet@pytables.org>
Message-ID:

On Wed, Dec 22, 2010 at 1:58 PM, Francesc Alted wrote:
> >>> %time b = ca.zeros(1e12)
> CPU times: user 54.76 s, sys: 0.03 s, total: 54.79 s
> Wall time: 55.23 s

I know this is somewhat missing the point of your demonstration, but 55 seconds to create an empty 3 GB data structure to represent a multi-TB dense array doesn't seem all that fast to me. Compression can do a lot of things, but isn't this a case where a true sparse data structure would be the right tool for the job?

I'm more interested in seeing what a carray can do with census data, web logs, or something vaguely real-world where direct binary representations are used by default and assumed to be reasonably optimal (i.e., anything sensibly stored in sqlite tables).

-Kevin

From willardmaier at gmail.com Wed Dec 22 19:20:27 2010
From: willardmaier at gmail.com (Bill Maier)
Date: Thu, 23 Dec 2010 00:20:27 -0000
Subject: [SciPy-User] Building scipy on Linux
References: AANLkTin2H6JyD6GzC_wcCVFUimSX1VQ2bpM8hHh8aWCJ@mail.gmail.com
Message-ID: <4D1295C6.8000309@gmail.com>

Thanks for the reply, Charles. I do use the pre-built Python/numpy/scipy on my system, and it works fine. However I'm trying to get set up to do development work on scipy and numpy.

Bill

From willardmaier at gmail.com Wed Dec 22 20:47:43 2010
From: willardmaier at gmail.com (Bill Maier)
Date: Thu, 23 Dec 2010 01:47:43 -0000
Subject: [SciPy-User] Building scipy on Linux
References: AANLkTin2H6JyD6GzC_wcCVFUimSX1VQ2bpM8hHh8aWCJ@mail.gmail.com
Message-ID: <4D12AA38.8010006@gmail.com>

I do use the prebuilt binaries on my machine, and they work fine. However right now I'm trying to get set up to do development work.
From gagneja2000 at gmail.com Wed Dec 29 13:29:52 2010
From: gagneja2000 at gmail.com (aashish gagneja)
Date: Wed, 29 Dec 2010 10:29:52 -0800 (PST)
Subject: [SciPy-User] genetic algorithm
Message-ID: <177c69c0-2755-4f84-a981-a860bd0bb6a0@v17g2000prc.googlegroups.com>

Hi,

I need your kind attention and help with my problem. I have to design an FIR filter using a genetic algorithm in Matlab. Kindly help me to design such a filter using the GA toolbox, or by Matlab or C++ programming. In my proposed filter, power consumption is reduced as the Hamming distance (signal toggling between bits) is reduced, and the mean square error is also reduced, so the overall fitness function is

F = w1*fM + w2*fH

where fM and fH are the fitness contributions due to the mean square error and the Hamming-distance error, and w1, w2 are weights such that w1 + w2 = 1. Also,

F = 1/(1 + Etot)

where Etot is the total error.

Kindly guide me on how to use the GADS toolbox for this, i.e. the objective function, linear constraints, etc.

Regds,
gagneja2000 at gmail.com
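For what it's worth, the combined fitness described above can be prototyped in a few lines of Python before moving to a GA toolbox. A sketch that combines the two expressions by weighting the errors and then mapping to a fitness; the integer encoding of the coefficients and the default weights are assumptions:

import numpy as np

def fitness(h, h_desired, w1=0.5, w2=0.5):
    # h, h_desired: FIR coefficients quantized to non-negative integer words
    h = np.asarray(h)
    h_desired = np.asarray(h_desired)
    e_mse = np.mean((h - h_desired)**2)
    # Hamming distance between successive coefficient words,
    # a proxy for signal toggling between bits
    e_ham = sum(bin(int(a) ^ int(b)).count('1')
                for a, b in zip(h[:-1], h[1:]))
    e_tot = w1 * e_mse + w2 * e_ham    # weighted total error, w1 + w2 = 1
    return 1.0 / (1.0 + e_tot)         # F = 1/(1 + Etot)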