From madsmh at gmail.com Wed Feb 1 06:56:06 2012 From: madsmh at gmail.com (Mads M. Hansen) Date: Wed, 1 Feb 2012 12:56:06 +0100 Subject: [SciPy-User] NumPy and SciPy test failures Message-ID: I have built NumPy 1.6.1 and SciPy 0.10.0 for Python 3.2 on a Fedora 16 system and I used gfortran, but when I run the tests I get the following failures and errors NumPy: ====================================================================== FAIL: test_kind.TestKind.test_all ---------------------------------------------------------------------- Traceback (most recent call last): File "/usr/lib/python3.2/site-packages/nose/case.py", line 198, in runTest self.test(*self.arg) File "/usr/lib64/python3.2/site-packages/numpy/f2py/tests/test_kind.py", line 30, in test_all 'selectedrealkind(%s): expected %r but got %r' % (i, selected_real_kind(i), selectedrealkind(i))) File "/usr/lib64/python3.2/site-packages/numpy/testing/utils.py", line 34, in assert_ raise AssertionError(msg) AssertionError: selectedrealkind(19): expected -1 but got 16 ====================================================================== FAIL: test_doctests (test_polynomial.TestDocs) ---------------------------------------------------------------------- Traceback (most recent call last): File "/usr/lib64/python3.2/site-packages/numpy/lib/tests/test_polynomial.py", line 84, in test_doctests return rundocs() File "/usr/lib64/python3.2/site-packages/numpy/testing/utils.py", line 988, in rundocs raise AssertionError("Some doctests failed:\n%s" % "\n".join(msg)) AssertionError: Some doctests failed: ********************************************************************** File "/usr/lib64/python3.2/site-packages/numpy/lib/tests/test_polynomial.py", line 32, in test_polynomial Failed example: p / q Expected: (poly1d([ 0.33333333]), poly1d([ 1.33333333, 2.66666667])) Got: (poly1d([ 0.333]), poly1d([ 1.333, 2.667])) ********************************************************************** File "/usr/lib64/python3.2/site-packages/numpy/lib/tests/test_polynomial.py", line 54, in test_polynomial Failed example: p.integ() Expected: poly1d([ 0.33333333, 1. , 3. , 0. ]) Got: poly1d([ 0.333, 1. , 3. , 0. ]) ********************************************************************** File "/usr/lib64/python3.2/site-packages/numpy/lib/tests/test_polynomial.py", line 56, in test_polynomial Failed example: p.integ(1) Expected: poly1d([ 0.33333333, 1. , 3. , 0. ]) Got: poly1d([ 0.333, 1. , 3. , 0. ]) ********************************************************************** File "/usr/lib64/python3.2/site-packages/numpy/lib/tests/test_polynomial.py", line 58, in test_polynomial Failed example: p.integ(5) Expected: poly1d([ 0.00039683, 0.00277778, 0.025 , 0. , 0. , 0. , 0. , 0. ]) Got: poly1d([ 0. , 0.003, 0.025, 0. , 0. , 0. , 0. , 0. 
]) ----------------------------------------------------------------------: And SciPy: ====================================================================== ERROR: Failure: ImportError (cannot import name _minimize_neldermead) ---------------------------------------------------------------------- Traceback (most recent call last): File "/usr/lib/python3.2/site-packages/nose/failure.py", line 37, in runTest raise self.exc_class(self.exc_val).with_traceback(self.tb) File "/usr/lib/python3.2/site-packages/nose/loader.py", line 390, in loadTestsFromName addr.filename, addr.module) File "/usr/lib/python3.2/site-packages/nose/importer.py", line 39, in importFromPath return self.importFromDir(dir_path, fqname) File "/usr/lib/python3.2/site-packages/nose/importer.py", line 86, in importFromDir mod = load_module(part_fqname, fh, filename, desc) File "/usr/lib64/python3.2/site-packages/scipy/optimize/tests/test_anneal.py", line 10, in from scipy.optimize import anneal, minimize File "/usr/lib64/python3.2/site-packages/scipy/optimize/minimize.py", line 16, in from .optimize import _minimize_neldermead, _minimize_powell, \ ImportError: cannot import name _minimize_neldermead ====================================================================== ERROR: Failure: ImportError (cannot import name cwt) ---------------------------------------------------------------------- Traceback (most recent call last): File "/usr/lib/python3.2/site-packages/nose/failure.py", line 37, in runTest raise self.exc_class(self.exc_val).with_traceback(self.tb) File "/usr/lib/python3.2/site-packages/nose/loader.py", line 390, in loadTestsFromName addr.filename, addr.module) File "/usr/lib/python3.2/site-packages/nose/importer.py", line 39, in importFromPath return self.importFromDir(dir_path, fqname) File "/usr/lib/python3.2/site-packages/nose/importer.py", line 86, in importFromDir mod = load_module(part_fqname, fh, filename, desc) File "/usr/lib64/python3.2/site-packages/scipy/signal/tests/test_peak_finding.py", line 7, in from scipy.signal._peak_finding import argrelmax, find_peaks_cwt, _identify_ridge_lines File "/usr/lib64/python3.2/site-packages/scipy/signal/_peak_finding.py", line 7, in from scipy.signal.wavelets import cwt, ricker ImportError: cannot import name cwt ====================================================================== ERROR: test_iv_cephes_vs_amos_mass_test (test_basic.TestBessel) ---------------------------------------------------------------------- Traceback (most recent call last): File "/usr/lib64/python3.2/site-packages/scipy/special/tests/test_basic.py", line 1642, in test_iv_cephes_vs_amos_mass_test c1 = special.iv(v, x) RuntimeWarning: divide by zero encountered in iv ====================================================================== ERROR: test_continuous_extra.test_cont_extra(, (0.4141193182605212,), 'loggamma loc, scale test') ---------------------------------------------------------------------- Traceback (most recent call last): File "/usr/lib/python3.2/site-packages/nose/case.py", line 198, in runTest self.test(*self.arg) File "/usr/lib64/python3.2/site-packages/scipy/stats/tests/test_continuous_extra.py", line 78, in check_loc_scale m,v = distfn.stats(*arg) File "/usr/lib64/python3.2/site-packages/scipy/stats/distributions.py", line 1632, in stats mu = self._munp(1.0,*goodargs) File "/usr/lib64/python3.2/site-packages/scipy/stats/distributions.py", line 4120, in _munp return self._mom0_sc(n,*args) File "/usr/lib64/python3.2/site-packages/scipy/stats/distributions.py", line 1166, in 
_mom0_sc self.b, args=(m,)+args)[0] File "/usr/lib64/python3.2/site-packages/scipy/integrate/quadpack.py", line 247, in quad retval = _quad(func,a,b,args,full_output,epsabs,epsrel,limit,points) File "/usr/lib64/python3.2/site-packages/scipy/integrate/quadpack.py", line 314, in _quad return _quadpack._qagie(func,bound,infbounds,args,full_output,epsabs,epsrel,limit) File "/usr/lib64/python3.2/site-packages/scipy/stats/distributions.py", line 1163, in _mom_integ0 return x**m * self.pdf(x,*args) File "/usr/lib64/python3.2/site-packages/scipy/stats/distributions.py", line 1263, in pdf place(output,cond,self._pdf(*goodargs) / scale) File "/usr/lib64/python3.2/site-packages/scipy/stats/distributions.py", line 4113, in _pdf return exp(c*x-exp(x)-gamln(c)) RuntimeWarning: overflow encountered in exp ====================================================================== ERROR: test_continuous_extra.test_cont_extra(, (1.8771398388773268,), 'lomax loc, scale test') ---------------------------------------------------------------------- Traceback (most recent call last): File "/usr/lib/python3.2/site-packages/nose/case.py", line 198, in runTest self.test(*self.arg) File "/usr/lib64/python3.2/site-packages/scipy/stats/tests/test_continuous_extra.py", line 78, in check_loc_scale m,v = distfn.stats(*arg) File "/usr/lib64/python3.2/site-packages/scipy/stats/distributions.py", line 1618, in stats mu, mu2, g1, g2 = self._stats(*args) File "/usr/lib64/python3.2/site-packages/scipy/stats/distributions.py", line 4644, in _stats mu, mu2, g1, g2 = pareto.stats(c, loc=-1.0, moments='mvsk') File "/usr/lib64/python3.2/site-packages/scipy/stats/distributions.py", line 1616, in stats mu, mu2, g1, g2 = self._stats(*args,**{'moments':moments}) File "/usr/lib64/python3.2/site-packages/scipy/stats/distributions.py", line 4595, in _stats vals = 2*(bt+1.0)*sqrt(b-2.0)/((b-3.0)*sqrt(b)) RuntimeWarning: invalid value encountered in sqrt ====================================================================== ERROR: test_discrete_basic.test_discrete_extra(, (30, 12, 6), 'hypergeom entropy nan test') ---------------------------------------------------------------------- Traceback (most recent call last): File "/usr/lib/python3.2/site-packages/nose/case.py", line 198, in runTest self.test(*self.arg) File "/usr/lib64/python3.2/site-packages/scipy/stats/tests/test_discrete_basic.py", line 199, in check_entropy ent = distfn.entropy(*arg) File "/usr/lib64/python3.2/site-packages/scipy/stats/distributions.py", line 6315, in entropy place(output,cond0,self.vecentropy(*goodargs)) File "/usr/lib64/python3.2/site-packages/numpy/lib/function_base.py", line 1863, in __call__ theout = self.thefunc(*newargs) File "/usr/lib64/python3.2/site-packages/scipy/stats/distributions.py", line 6669, in _entropy lvals = where(vals==0.0,0.0,log(vals)) RuntimeWarning: divide by zero encountered in log ====================================================================== ERROR: test_discrete_basic.test_discrete_extra(, (21, 3, 12), 'hypergeom entropy nan test') ---------------------------------------------------------------------- Traceback (most recent call last): File "/usr/lib/python3.2/site-packages/nose/case.py", line 198, in runTest self.test(*self.arg) File "/usr/lib64/python3.2/site-packages/scipy/stats/tests/test_discrete_basic.py", line 199, in check_entropy ent = distfn.entropy(*arg) File "/usr/lib64/python3.2/site-packages/scipy/stats/distributions.py", line 6315, in entropy place(output,cond0,self.vecentropy(*goodargs)) File 
"/usr/lib64/python3.2/site-packages/numpy/lib/function_base.py", line 1863, in __call__ theout = self.thefunc(*newargs) File "/usr/lib64/python3.2/site-packages/scipy/stats/distributions.py", line 6669, in _entropy lvals = where(vals==0.0,0.0,log(vals)) RuntimeWarning: divide by zero encountered in log ====================================================================== ERROR: test_fit (test_distributions.TestFitMethod) ---------------------------------------------------------------------- Traceback (most recent call last): File "/usr/lib64/python3.2/site-packages/scipy/stats/tests/test_distributions.py", line 439, in test_fit vals2 = distfunc.fit(res, optimizer='powell') File "/usr/lib64/python3.2/site-packages/scipy/stats/distributions.py", line 1875, in fit vals = optimizer(func,x0,args=(ravel(data),),disp=0) File "/usr/lib64/python3.2/site-packages/scipy/optimize/optimize.py", line 1622, in fmin_powell fval, x, direc1 = _linesearch_powell(func, x, direc1, tol=xtol*100) File "/usr/lib64/python3.2/site-packages/scipy/optimize/optimize.py", line 1492, in _linesearch_powell alpha_min, fret, iter, num = brent(myfunc, full_output=1, tol=tol) File "/usr/lib64/python3.2/site-packages/scipy/optimize/optimize.py", line 1313, in brent brent.optimize() File "/usr/lib64/python3.2/site-packages/scipy/optimize/optimize.py", line 1214, in optimize tmp2 = (x-v)*(fx-fw) RuntimeWarning: invalid value encountered in double_scalars ====================================================================== ERROR: test_fix_fit (test_distributions.TestFitMethod) ---------------------------------------------------------------------- Traceback (most recent call last): File "/usr/lib64/python3.2/site-packages/scipy/stats/tests/test_distributions.py", line 460, in test_fix_fit vals2 = distfunc.fit(res,fscale=1) File "/usr/lib64/python3.2/site-packages/scipy/stats/distributions.py", line 1875, in fit vals = optimizer(func,x0,args=(ravel(data),),disp=0) File "/usr/lib64/python3.2/site-packages/scipy/optimize/optimize.py", line 302, in fmin and max(abs(fsim[0]-fsim[1:])) <= ftol): RuntimeWarning: invalid value encountered in subtract ====================================================================== ERROR: Failure: ImportError (cannot import name common_info) ---------------------------------------------------------------------- Traceback (most recent call last): File "/usr/lib/python3.2/site-packages/nose/failure.py", line 37, in runTest raise self.exc_class(self.exc_val).with_traceback(self.tb) File "/usr/lib/python3.2/site-packages/nose/loader.py", line 390, in loadTestsFromName addr.filename, addr.module) File "/usr/lib/python3.2/site-packages/nose/importer.py", line 39, in importFromPath return self.importFromDir(dir_path, fqname) File "/usr/lib/python3.2/site-packages/nose/importer.py", line 86, in importFromDir mod = load_module(part_fqname, fh, filename, desc) File "/usr/lib64/python3.2/site-packages/scipy/weave/__init__.py", line 26, in from .inline_tools import inline File "/usr/lib64/python3.2/site-packages/scipy/weave/inline_tools.py", line 5, in from . import ext_tools File "/usr/lib64/python3.2/site-packages/scipy/weave/ext_tools.py", line 7, in from . import converters File "/usr/lib64/python3.2/site-packages/scipy/weave/converters.py", line 4, in from . 
import common_info
ImportError: cannot import name common_info

======================================================================
FAIL: test_mio.test_mat4_3d
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/usr/lib/python3.2/site-packages/nose/case.py", line 198, in runTest
    self.test(*self.arg)
  File "/usr/lib64/python3.2/site-packages/scipy/io/matlab/tests/test_mio.py", line 740, in test_mat4_3d
    stream, {'a': arr}, True, '4')
  File "/usr/lib64/python3.2/site-packages/numpy/testing/utils.py", line 1008, in assert_raises
    return nose.tools.assert_raises(*args,**kwargs)
AssertionError: DeprecationWarning not raised by functools.partial(, oned_as='row')

======================================================================
FAIL: Regression test for #651: better handling of badly conditioned
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/usr/lib64/python3.2/site-packages/scipy/signal/tests/test_filter_design.py", line 34, in test_bad_filter
    assert_raises(BadCoefficients, tf2zpk, [1e-15], [1.0, 1.0])
  File "/usr/lib64/python3.2/site-packages/numpy/testing/utils.py", line 1008, in assert_raises
    return nose.tools.assert_raises(*args,**kwargs)
AssertionError: BadCoefficients not raised by tf2zpk

----------------------------------------------------------------------

The NumPy errors seem to be mostly rounding errors, but it seems to
round quite aggressively. How significant are these errors?

.. Mads

From nouiz at nouiz.org Wed Feb 1 07:44:56 2012
From: nouiz at nouiz.org (=?ISO-8859-1?Q?Fr=E9d=E9ric_Bastien?=)
Date: Wed, 1 Feb 2012 07:44:56 -0500
Subject: [SciPy-User] Accumulation sum using indirect indexes
In-Reply-To: References: Message-ID:

It will be slow, but you can make a python loop.

Fred

On Jan 31, 2012 3:34 PM, "Alexander Kalinin" wrote:

> Hello!
>
> I use SciPy in computer graphics applications. My task is to calculate
> vertex normals by averaging faces normals. In other words I want to
> accumulate vectors with the same ids. For example,
>
> ids = numpy.array([0, 1, 1, 2])
> n = numpy.array([ [0.1, 0.1, 0.1], [0.1, 0.1, 0.1], [0.1, 0.1, 0.1], [0.1,
> 0.1 0.1] ])
>
> I need result:
> nv = ([ [0.1, 0.1, 0.1], [0.2, 0.2, 0.2], [0.1, 0.1, 0.1]])
>
> The most simple code:
> nv[ids] += n
> does not work, I know about this. For 1D arrays I use numpy.bincount(...)
> function. But this function does not work for 2D arrays.
>
> So, my question. What is the best way calculate accumulation sum for 2D
> arrays using indirect indexes?
>
> Sincerely,
> Alexander
>
> _______________________________________________
> SciPy-User mailing list
> SciPy-User at scipy.org
> http://mail.scipy.org/mailman/listinfo/scipy-user
>

From charlesr.harris at gmail.com Wed Feb 1 09:01:11 2012
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Wed, 1 Feb 2012 07:01:11 -0700
Subject: [SciPy-User] NumPy and SciPy test failures
In-Reply-To: References: Message-ID:

On Wed, Feb 1, 2012 at 4:56 AM, Mads M. Hansen wrote:

> I have built NumPy 1.6.1 and SciPy 0.10.0 for Python 3.2 on a Fedora
> 16 system and I used gfortran, but when I run the tests I get the
> following failures and errors
>
> NumPy:
>
> ======================================================================
> FAIL: test_kind.TestKind.test_all
> ----------------------------------------------------------------------
> Traceback (most recent call last):
>   File "/usr/lib/python3.2/site-packages/nose/case.py", line 198, in runTest
>     self.test(*self.arg)
>   File "/usr/lib64/python3.2/site-packages/numpy/f2py/tests/test_kind.py", line 30, in test_all
>     'selectedrealkind(%s): expected %r but got %r' % (i,
> selected_real_kind(i), selectedrealkind(i)))
>   File "/usr/lib64/python3.2/site-packages/numpy/testing/utils.py", line 34, in assert_
>     raise AssertionError(msg)
> AssertionError: selectedrealkind(19): expected -1 but got 16
>

I think this is a bug in the test that comes from adding the float16 type.

From dlaxalde at gmail.com Wed Feb 1 10:37:54 2012
From: dlaxalde at gmail.com (Denis Laxalde)
Date: Wed, 1 Feb 2012 10:37:54 -0500
Subject: [SciPy-User] NumPy and SciPy test failures
In-Reply-To: References: Message-ID: <20120201103754.63be48ec@mcgill.ca>

Mads M. Hansen wrote:
> I have built NumPy 1.6.1 and SciPy 0.10.0 for Python 3.2 on a Fedora
> 16 system and I used gfortran, but when I run the tests I get the
> following failures and errors

It's probably not scipy 0.10.0. Could you specify the exact versions
you have installed (e.g. from the header displayed by scipy tests)?
> And SciPy: > > ====================================================================== > ERROR: Failure: ImportError (cannot import name _minimize_neldermead) > ---------------------------------------------------------------------- > Traceback (most recent call last): > File "/usr/lib/python3.2/site-packages/nose/failure.py", line 37, in runTest > raise self.exc_class(self.exc_val).with_traceback(self.tb) > File "/usr/lib/python3.2/site-packages/nose/loader.py", line 390, in > loadTestsFromName > addr.filename, addr.module) > File "/usr/lib/python3.2/site-packages/nose/importer.py", line 39, > in importFromPath > return self.importFromDir(dir_path, fqname) > File "/usr/lib/python3.2/site-packages/nose/importer.py", line 86, > in importFromDir > mod = load_module(part_fqname, fh, filename, desc) > File "/usr/lib64/python3.2/site-packages/scipy/optimize/tests/test_anneal.py", > line 10, in > from scipy.optimize import anneal, minimize > File "/usr/lib64/python3.2/site-packages/scipy/optimize/minimize.py", > line 16, in > from .optimize import _minimize_neldermead, _minimize_powell, \ > ImportError: cannot import name _minimize_neldermead I was interested by this one but cannot reproduce it with current master on python 3.2.2. -- Denis From glen at toadhill.net Wed Feb 1 11:16:37 2012 From: glen at toadhill.net (glen at toadhill.net) Date: Wed, 1 Feb 2012 11:16:37 -0500 Subject: [SciPy-User] asarray_chkfinite Message-ID: <9394EDC3-D458-4697-86AC-75115C14AA62@toadhill.net> Hi all, I'm trying to optimize some code that entails a very large number of sparse matrix-vector and vctor-vector multiplies. Upon running the profiler I see that about 25% of my program's cumulative time is spent running asarray_chkfinite. I do not call this routine directly. Can anyone tell me what might be calling it and whether there is anything obvious I can do about it? Glen Glen Henshaw, PhD ? Roboticist ? U.S. Naval Research Laboratory office: 202-767-1196 ? google voice/mobile: 443-295-3050 glen.henshaw at nrl.navy.mil -------------- next part -------------- An HTML attachment was scrubbed... URL: From madsmh at gmail.com Wed Feb 1 11:17:43 2012 From: madsmh at gmail.com (Mads M. Hansen) Date: Wed, 1 Feb 2012 17:17:43 +0100 Subject: [SciPy-User] NumPy and SciPy test failures In-Reply-To: <20120201103754.63be48ec@mcgill.ca> References: <20120201103754.63be48ec@mcgill.ca> Message-ID: 2012/2/1 Denis Laxalde : > Mads M. Hansen wrote: >> I have built NumPy 1.6.1 and SciPy 0.10.0 for Python 3.2 on a Fedora >> 16 system and I used gfortran, but when I run the tests I get the >> following failures and errors > > It's probably not scipy 0.10.0. Could you specify the exact versions of > you have installed (e.g. from the header displayed by scipy tests)? > Here is the header >>> scipy.test('full') Running unit tests for scipy NumPy version 1.6.1 NumPy is installed in /usr/lib64/python3.2/site-packages/numpy SciPy version 0.10.0 SciPy is installed in /usr/lib64/python3.2/site-packages/scipy Python version 3.2.1 (default, Jul 11 2011, 18:54:42) [GCC 4.6.1 20110627 (Red Hat 4.6.1-1)] nose version 1.1.2 I checked out the v.0.10.0 tag from the Git repository. .. Mads From lafont.fabien at gmail.com Wed Feb 1 11:21:08 2012 From: lafont.fabien at gmail.com (Fabien Lafont) Date: Wed, 1 Feb 2012 17:21:08 +0100 Subject: [SciPy-User] [scipy-user] How to add a 1D np.array to another np.array? Message-ID: Hello everyone, I try to add an array to another (to build a 2D array). 
I try that a = np.zeros(2) b=np.array([2,4]) a[[0]] = b print a [2,0] And I want a= [[2,4],0] How can I do? Thx Fabien From alec.kalinin at gmail.com Wed Feb 1 11:34:12 2012 From: alec.kalinin at gmail.com (Alexander Kalinin) Date: Wed, 1 Feb 2012 20:34:12 +0400 Subject: [SciPy-User] Accumulation sum using indirect indexes In-Reply-To: References: Message-ID: Yes, but for large data sets loops is quite slow. I have tried Pandas groupby.sum() and it works faster. 2012/2/1 Fr?d?ric Bastien > It will be slow, but you can make a python loop. > > Fred > On Jan 31, 2012 3:34 PM, "Alexander Kalinin" > wrote: > >> Hello! >> >> I use SciPy in computer graphics applications. My task is to calculate >> vertex normals by averaging faces normals. In other words I want to >> accumulate vectors with the same ids. For example, >> >> ids = numpy.array([0, 1, 1, 2]) >> n = numpy.array([ [0.1, 0.1, 0.1], [0.1, 0.1, 0.1], [0.1, 0.1, 0.1], >> [0.1, 0.1 0.1] ]) >> >> I need result: >> nv = ([ [0.1, 0.1, 0.1], [0.2, 0.2, 0.2], [0.1, 0.1, 0.1]]) >> >> The most simple code: >> nv[ids] += n >> does not work, I know about this. For 1D arrays I use numpy.bincount(...) >> function. But this function does not work for 2D arrays. >> >> So, my question. What is the best way calculate accumulation sum for 2D >> arrays using indirect indexes? >> >> Sincerely, >> Alexander >> >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user >> >> > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From warren.weckesser at enthought.com Wed Feb 1 11:35:25 2012 From: warren.weckesser at enthought.com (Warren Weckesser) Date: Wed, 1 Feb 2012 10:35:25 -0600 Subject: [SciPy-User] [scipy-user] How to add a 1D np.array to another np.array? In-Reply-To: References: Message-ID: On Wed, Feb 1, 2012 at 10:21 AM, Fabien Lafont wrote: > Hello everyone, > > I try to add an array to another (to build a 2D array). > > I try that > > > > a = np.zeros(2) > b=np.array([2,4]) > a[[0]] = b > > print a > [2,0] > > And I want a= [[2,4],0] > > But that is not a 2D array. Do you want to "stack" b above the zeros? Perhaps something like this: In [7]: a = np.zeros(2) In [8]: b = np.array([2,4]) In [9]: c = np.vstack((b,a)) In [10]: c Out[10]: array([[ 2., 4.], [ 0., 0.]]) Warren -------------- next part -------------- An HTML attachment was scrubbed... URL: From gustavo.goretkin at gmail.com Wed Feb 1 11:36:38 2012 From: gustavo.goretkin at gmail.com (Gustavo Goretkin) Date: Wed, 1 Feb 2012 11:36:38 -0500 Subject: [SciPy-User] masked recarray, recarray with one field of type "ndarray" In-Reply-To: References: Message-ID: Thanks for the help! Now is there any way to mask elements of a recarray? I should explain the application because I think I may be going about this the wrong way: I'll be building a tree and each node will have some attributes (for example, a matrix). I often have to iterate through every node of the tree and do a calculation -- something that I could do in a vectorized way with NumPy if all the attributes were stored in an array. So I thought I could represent the tree as a recarray (that I'd occasionally need to grow). I'd also need to delete nodes from the tree occasionally. I'd accomplish this by masking entries of the recarray. 
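A minimal sketch of that bookkeeping, using a plain structured array with an
explicit boolean "free" field instead of a masked recarray; the field names,
the 3x3 matrix attribute and the doubling growth policy are only illustrative:

import numpy as np

# Hypothetical node layout: a parent index plus a 3x3 matrix per node.
node_dt = np.dtype([('parent', np.int32),
                    ('mat', np.float64, (3, 3)),
                    ('free', np.bool_)])

def make_pool(size):
    nodes = np.zeros(size, dtype=node_dt)
    nodes['free'] = True            # every slot starts out unused
    return nodes

def add_node(nodes, parent, mat):
    free = np.nonzero(nodes['free'])[0]
    if len(free) == 0:              # no reusable slot left: double the pool
        grown = np.zeros(2 * len(nodes), dtype=node_dt)
        grown[:len(nodes)] = nodes
        grown['free'][len(nodes):] = True
        nodes, free = grown, np.array([len(nodes)])
    i = free[0]
    nodes['parent'][i] = parent
    nodes['mat'][i] = mat
    nodes['free'][i] = False        # the slot is now a live node
    return nodes, i

def remove_node(nodes, i):
    nodes['free'][i] = True         # "delete" by marking the slot reusable

Vectorized passes over the whole tree can then restrict themselves to the
live rows, e.g. live = ~nodes['free']; mats = nodes['mat'][live].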
When I needed to add a node to the tree, I'd try to populate a masked entry before going to the end of the array. On Tue, Jan 31, 2012 at 9:33 AM, Warren Weckesser wrote: > > > On Tue, Jan 31, 2012 at 2:36 AM, Gustavo Goretkin > wrote: >> >> Does a recarray support masking? >> >> Can I have a recarray where one of the fields is an M-by-N ndarray >> (not recarray) of some dtype? >> ex: a = np.recarray(shape=(10),formats=['i4','f8','3-by-3 ndarray of >> dtype=float64']) > > > > Here's how it can be done with the dtype argument (in this case, the > "sub-arrays" are 3x5 float32): > > In [21]: dt = np.dtype([('id', int32), ('values', float32, (3,5))]) > > In [22]: a = np.recarray(shape=(3,), dtype=dt) > > In [23]: a.id > Out[23]: array([????? 7, 2345536, 8585218]) > > In [24]: a[0].id > Out[24]: 7 > > In [25]: a[0].values > Out[25]: > array([[? 9.80908925e-45,?? 2.15997513e-37,?? 3.16079124e-39, > ????????? 1.18408375e-38,?? 2.81552923e-38], > ?????? [? 2.13004362e-37,? -7.69011974e-02,?? 9.80908925e-45, > ????????? 9.80908925e-45,?? 3.62636667e-21], > ?????? [? 5.67059093e-24,?? 5.67095065e-24,?? 5.64768872e-24, > ????????? 7.86448908e+11,?? 0.00000000e+00]], dtype=float32) > > In [26]: a[0].values.shape > Out[26]: (3, 5) > > > Warren > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From ralf.gommers at googlemail.com Wed Feb 1 12:12:49 2012 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Wed, 1 Feb 2012 18:12:49 +0100 Subject: [SciPy-User] NumPy and SciPy test failures In-Reply-To: References: <20120201103754.63be48ec@mcgill.ca> Message-ID: On Wed, Feb 1, 2012 at 5:17 PM, Mads M. Hansen wrote: > 2012/2/1 Denis Laxalde : > > Mads M. Hansen wrote: > >> I have built NumPy 1.6.1 and SciPy 0.10.0 for Python 3.2 on a Fedora > >> 16 system and I used gfortran, but when I run the tests I get the > >> following failures and errors > > > > It's probably not scipy 0.10.0. Could you specify the exact versions of > > you have installed (e.g. from the header displayed by scipy tests)? > > > Here is the header > > >>> scipy.test('full') > Running unit tests for scipy > NumPy version 1.6.1 > NumPy is installed in /usr/lib64/python3.2/site-packages/numpy > SciPy version 0.10.0 > SciPy is installed in /usr/lib64/python3.2/site-packages/scipy > Python version 3.2.1 (default, Jul 11 2011, 18:54:42) [GCC 4.6.1 > 20110627 (Red Hat 4.6.1-1)] > nose version 1.1.2 > > I checked out the v.0.10.0 tag from the Git repository. > > The reason you're seeing the _minimize_neldermead failure, and probably some others, is likely that you didn't clean the install dir before installing 0.10.0. That test was only added after 0.10.0 came out. Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at googlemail.com Wed Feb 1 12:47:52 2012 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Wed, 1 Feb 2012 18:47:52 +0100 Subject: [SciPy-User] asarray_chkfinite In-Reply-To: <9394EDC3-D458-4697-86AC-75115C14AA62@toadhill.net> References: <9394EDC3-D458-4697-86AC-75115C14AA62@toadhill.net> Message-ID: On Wed, Feb 1, 2012 at 5:16 PM, glen at toadhill.net wrote: > Hi all, > > I'm trying to optimize some code that entails a very large number of > sparse matrix-vector and vctor-vector multiplies. Upon running the > profiler I see that about 25% of my program's cumulative time is spent > running asarray_chkfinite. I do not call this routine directly. 
Can > anyone tell me what might be calling it and whether there is anything > obvious I can do about it? > It is called by many routines in order to check input arrays for bad data (inf/nan) that can cause crashes. A proposed change to allow disabling these checks is at https://github.com/scipy/scipy/pull/48. Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From madsmh at gmail.com Wed Feb 1 14:05:57 2012 From: madsmh at gmail.com (Mads M. Hansen) Date: Wed, 1 Feb 2012 20:05:57 +0100 Subject: [SciPy-User] NumPy and SciPy test failures In-Reply-To: References: <20120201103754.63be48ec@mcgill.ca> Message-ID: 2012/2/1 Ralf Gommers : > > > On Wed, Feb 1, 2012 at 5:17 PM, Mads M. Hansen wrote: >> >> 2012/2/1 Denis Laxalde : >> > Mads M. Hansen wrote: >> >> I have built NumPy 1.6.1 and SciPy 0.10.0 for Python 3.2 on a Fedora >> >> 16 system and I used gfortran, but when I run the tests I get the >> >> following failures and errors >> > >> > It's probably not scipy 0.10.0. Could you specify the exact versions of >> > you have installed (e.g. from the header displayed by scipy tests)? >> > >> Here is the header >> >> >>> scipy.test('full') >> Running unit tests for scipy >> NumPy version 1.6.1 >> NumPy is installed in /usr/lib64/python3.2/site-packages/numpy >> SciPy version 0.10.0 >> SciPy is installed in /usr/lib64/python3.2/site-packages/scipy >> Python version 3.2.1 (default, Jul 11 2011, 18:54:42) [GCC 4.6.1 >> 20110627 (Red Hat 4.6.1-1)] >> nose version 1.1.2 >> >> I checked out the v.0.10.0 tag from the Git repository. >> > The reason you're seeing the _minimize_neldermead failure, and probably some > others, is likely that you didn't clean the install dir before installing > 0.10.0. That test was only added after 0.10.0 came out. > > Ralf Hm that's odd, I delete /usr/lib64/python3.2/site-packages/scipy (and numpy/) prior to each reinstall - is scipy installed in other places by defaultr? From pav at iki.fi Wed Feb 1 14:17:08 2012 From: pav at iki.fi (Pauli Virtanen) Date: Wed, 01 Feb 2012 20:17:08 +0100 Subject: [SciPy-User] NumPy and SciPy test failures In-Reply-To: References: <20120201103754.63be48ec@mcgill.ca> Message-ID: 01.02.2012 20:05, Mads M. Hansen kirjoitti: > Hm that's odd, I delete /usr/lib64/python3.2/site-packages/scipy (and > numpy/) prior to each reinstall - is scipy installed in > other places by defaultr? You'll also need to delete the build/ directory. -- Pauli Virtanen From madsmh at gmail.com Wed Feb 1 14:41:28 2012 From: madsmh at gmail.com (Mads M. 
Hansen) Date: Wed, 1 Feb 2012 20:41:28 +0100 Subject: [SciPy-User] NumPy and SciPy test failures In-Reply-To: References: <20120201103754.63be48ec@mcgill.ca> Message-ID: Thanks, properly cleaning before building brought the errors down to one which follows, ====================================================================== ERROR: Failure: AttributeError ('module' object has no attribute 'FileType') ---------------------------------------------------------------------- Traceback (most recent call last): File "/usr/lib/python3.2/site-packages/nose/failure.py", line 37, in runTest raise self.exc_class(self.exc_val).with_traceback(self.tb) File "/usr/lib/python3.2/site-packages/nose/loader.py", line 390, in loadTestsFromName addr.filename, addr.module) File "/usr/lib/python3.2/site-packages/nose/importer.py", line 39, in importFromPath return self.importFromDir(dir_path, fqname) File "/usr/lib/python3.2/site-packages/nose/importer.py", line 86, in importFromDir mod = load_module(part_fqname, fh, filename, desc) File "/usr/lib64/python3.2/site-packages/scipy/weave/__init__.py", line 22, in from .blitz_tools import blitz File "/usr/lib64/python3.2/site-packages/scipy/weave/blitz_tools.py", line 6, in from . import converters File "/usr/lib64/python3.2/site-packages/scipy/weave/converters.py", line 19, in c_spec.file_converter(), File "/usr/lib64/python3.2/site-packages/scipy/weave/c_spec.py", line 74, in __init__ self.init_info() File "/usr/lib64/python3.2/site-packages/scipy/weave/c_spec.py", line 264, in init_info self.matching_types = [types.FileType] AttributeError: 'module' object has no attribute 'FileType' ---------------------------------------------------------------------- 2012/2/1 Pauli Virtanen : > 01.02.2012 20:05, Mads M. Hansen kirjoitti: >> Hm that's odd, I delete /usr/lib64/python3.2/site-packages/scipy (and >> numpy/) prior to each reinstall - is scipy installed in >> other places by defaultr? > > You'll also need to delete the build/ directory. > > -- > Pauli Virtanen > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user From ralf.gommers at googlemail.com Wed Feb 1 16:29:34 2012 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Wed, 1 Feb 2012 22:29:34 +0100 Subject: [SciPy-User] NumPy and SciPy test failures In-Reply-To: References: <20120201103754.63be48ec@mcgill.ca> Message-ID: On Wed, Feb 1, 2012 at 8:41 PM, Mads M. 
Hansen wrote: > Thanks, properly cleaning before building brought the errors down to > one which follows, > > ====================================================================== > ERROR: Failure: AttributeError ('module' object has no attribute > 'FileType') > ---------------------------------------------------------------------- > Traceback (most recent call last): > File "/usr/lib/python3.2/site-packages/nose/failure.py", line 37, in > runTest > raise self.exc_class(self.exc_val).with_traceback(self.tb) > File "/usr/lib/python3.2/site-packages/nose/loader.py", line 390, in > loadTestsFromName > addr.filename, addr.module) > File "/usr/lib/python3.2/site-packages/nose/importer.py", line 39, > in importFromPath > return self.importFromDir(dir_path, fqname) > File "/usr/lib/python3.2/site-packages/nose/importer.py", line 86, > in importFromDir > mod = load_module(part_fqname, fh, filename, desc) > File "/usr/lib64/python3.2/site-packages/scipy/weave/__init__.py", > line 22, in > from .blitz_tools import blitz > File "/usr/lib64/python3.2/site-packages/scipy/weave/blitz_tools.py", > line 6, in > from . import converters > File "/usr/lib64/python3.2/site-packages/scipy/weave/converters.py", > line 19, in > c_spec.file_converter(), > File "/usr/lib64/python3.2/site-packages/scipy/weave/c_spec.py", > line 74, in __init__ > self.init_info() > File "/usr/lib64/python3.2/site-packages/scipy/weave/c_spec.py", > line 264, in init_info > self.matching_types = [types.FileType] > AttributeError: 'module' object has no attribute 'FileType' > > ---------------------------------------------------------------------- > > This failure is caused by the weave module not being py3k compatible. Is known. Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From lafont.fabien at gmail.com Thu Feb 2 03:34:42 2012 From: lafont.fabien at gmail.com (Fabien Lafont) Date: Thu, 2 Feb 2012 09:34:42 +0100 Subject: [SciPy-User] [scipy-user] How to add a 1D np.array to another np.array? In-Reply-To: References: Message-ID: In fact I would prefer, replace the first 0 by [2,4] but it still interesting, thx! Fabien 2012/2/1 Warren Weckesser : > > > On Wed, Feb 1, 2012 at 10:21 AM, Fabien Lafont > wrote: >> >> Hello everyone, >> >> I try to add an array to another (to build a 2D array). >> >> I try that >> >> >> >> a = np.zeros(2) >> b=np.array([2,4]) >> a[[0]] = b >> >> print a >> [2,0] >> >> And I want a= [[2,4],0] >> > > But that is not a 2D array.? Do you want to "stack" b above the zeros? > Perhaps something like this: > > In [7]: a = np.zeros(2) > > In [8]: b = np.array([2,4]) > > In [9]: c = np.vstack((b,a)) > > In [10]: c > Out[10]: > array([[ 2.,? 4.], > ?????? [ 0.,? 0.]]) > > > Warren > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From scipy at samueljohn.de Thu Feb 2 04:58:01 2012 From: scipy at samueljohn.de (Samuel John) Date: Thu, 2 Feb 2012 10:58:01 +0100 Subject: [SciPy-User] [scipy-user] How to add a 1D np.array to another np.array? In-Reply-To: References: Message-ID: Hi Fabien! On 02.02.2012, at 09:34, Fabien Lafont wrote: > In fact I would prefer, replace the first 0 by [2,4] but it still > interesting, thx! ...that is only possible with lists of lists. Arrays have to have the same dimension for each entry. 
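A short illustration of the two cases, reusing the names from the question
above (a sketch; rows of equal length stack into a true 2D array, while a row
next to a bare scalar is ragged and only fits in a Python list):

import numpy as np

a = np.zeros(2)
b = np.array([2, 4])

# equal-length rows -> a real 2D array
c = np.vstack((b, a))   # array([[ 2.,  4.],
                        #        [ 0.,  0.]])

# [2, 4] next to a bare 0 is ragged -> keep it in a plain Python list
ragged = [b, 0]         # [array([2, 4]), 0]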
bests, Samuel From lamblinp at iro.umontreal.ca Thu Feb 2 11:52:44 2012 From: lamblinp at iro.umontreal.ca (Pascal Lamblin) Date: Thu, 2 Feb 2012 17:52:44 +0100 Subject: [SciPy-User] Indexing sparse matrices with step Message-ID: <20120202165243.GA1808@bob.blip.be> Hi everybody, I've noticed that if I have a scipy.sparse matrix (csr or csc), and I try to index into it with slices, the "step" component of my slice seems to be silently ignored. Is it an expected behaviour? I would have expected an error saying only a step of None (or not providing a step at all) is supported. Here is a small test case: import numpy, scipy.sparse sm = scipy.sparse.csc_matrix([[1, 0, 0], [0, 0, 0], [0, 0, 0], [0, 1, 0]]) # True, expected numpy.all(sm[:1,:].toarray() == sm.toarray()[:1,:]) # False, expected numpy.all(sm.toarray()[:1,:] == sm.toarray()[:1:-1,:]) # True, unexpected numpy.all(sm[:1:-1,:].toarray() == sm[:1,:].toarray()) # False, unexpected numpy.all(sm[:1:-1,:].toarray() == sm.toarray()[:1:-1,:]) Thanks in advance, -- Pascal From warren.weckesser at enthought.com Thu Feb 2 13:16:20 2012 From: warren.weckesser at enthought.com (Warren Weckesser) Date: Thu, 2 Feb 2012 12:16:20 -0600 Subject: [SciPy-User] Accumulation sum using indirect indexes In-Reply-To: References: Message-ID: On Wed, Feb 1, 2012 at 10:34 AM, Alexander Kalinin wrote: > Yes, but for large data sets loops is quite slow. I have tried Pandas > groupby.sum() and it works faster. > > Pandas is probably the correct tool to use for this, but it will be nice when numpy has a native "group-by" capability. For what its worth (had to scratch the itch, so to speak), the attached script provides a "pure numpy" implementation without a python loop. The output of the script is In [53]: run pseudo_group_by.py Label Data 20 [1 2 3] 20 [1 2 4] 10 [3 3 1] 0 [5 0 0] 20 [1 9 0] 10 [2 3 4] 20 [9 9 1] Label Num. Sum 0 1 [5 0 0] 10 2 [5 6 5] 20 4 [12 22 8] A drawback of the method is that it will make a reordered copy of the data. I haven't compared the performance to pandas. Warren > > 2012/2/1 Fr?d?ric Bastien > >> It will be slow, but you can make a python loop. >> >> Fred >> On Jan 31, 2012 3:34 PM, "Alexander Kalinin" >> wrote: >> >>> Hello! >>> >>> I use SciPy in computer graphics applications. My task is to calculate >>> vertex normals by averaging faces normals. In other words I want to >>> accumulate vectors with the same ids. For example, >>> >>> ids = numpy.array([0, 1, 1, 2]) >>> n = numpy.array([ [0.1, 0.1, 0.1], [0.1, 0.1, 0.1], [0.1, 0.1, 0.1], >>> [0.1, 0.1 0.1] ]) >>> >>> I need result: >>> nv = ([ [0.1, 0.1, 0.1], [0.2, 0.2, 0.2], [0.1, 0.1, 0.1]]) >>> >>> The most simple code: >>> nv[ids] += n >>> does not work, I know about this. For 1D arrays I use >>> numpy.bincount(...) function. But this function does not work for 2D arrays. >>> >>> So, my question. What is the best way calculate accumulation sum for 2D >>> arrays using indirect indexes? 
>>> >>> Sincerely, >>> Alexander >>> >>> _______________________________________________ >>> SciPy-User mailing list >>> SciPy-User at scipy.org >>> http://mail.scipy.org/mailman/listinfo/scipy-user >>> >>> >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user >> >> > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: pseudo_group_by.py Type: application/octet-stream Size: 1455 bytes Desc: not available URL: From josef.pktd at gmail.com Thu Feb 2 14:01:10 2012 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Thu, 2 Feb 2012 14:01:10 -0500 Subject: [SciPy-User] Accumulation sum using indirect indexes In-Reply-To: References: Message-ID: On Thu, Feb 2, 2012 at 1:16 PM, Warren Weckesser wrote: > > > On Wed, Feb 1, 2012 at 10:34 AM, Alexander Kalinin > wrote: >> >> Yes, but for large data sets loops is quite slow. I have tried Pandas >> groupby.sum() and it works faster. >> > > > Pandas is probably the correct tool to use for this, but it will be nice > when numpy has a native "group-by" capability. > > For what its worth (had to scratch the itch, so to speak), the attached > script provides a "pure numpy" implementation without a python loop.? The > output of the script is > > In [53]: run pseudo_group_by.py > Label?? Data > ?20??? [1 2 3] > ?20??? [1 2 4] > ?10??? [3 3 1] > ? 0??? [5 0 0] > ?20??? [1 9 0] > ?10??? [2 3 4] > ?20??? [9 9 1] > > Label? Num.?? Sum > ? 0???? 1?? [5 0 0] > ?10???? 2?? [5 6 5] > ?20???? 4?? [12 22? 8] > > > A drawback of the method is that it will make a reordered copy of the data. > I haven't compared the performance to pandas. nice use of reduceat, I found it recently in an example but haven't used it yet. It looks convenient if labels are presorted and numeric. Josef > > Warren > > >> >> >> 2012/2/1 Fr?d?ric Bastien >>> >>> It will be slow, but you can make a python loop. >>> >>> Fred >>> >>> On Jan 31, 2012 3:34 PM, "Alexander Kalinin" >>> wrote: >>>> >>>> Hello! >>>> >>>> I use SciPy in computer graphics applications. My task is to calculate >>>> vertex normals by averaging faces normals. In other words I want to >>>> accumulate vectors with the same ids. For example, >>>> >>>> ids = numpy.array([0, 1, 1, 2]) >>>> n = numpy.array([ [0.1, 0.1, 0.1], [0.1, 0.1, 0.1], [0.1, 0.1, 0.1], >>>> [0.1, 0.1 0.1] ]) >>>> >>>> I need result: >>>> nv = ([ [0.1, 0.1, 0.1], [0.2, 0.2, 0.2], [0.1, 0.1, 0.1]]) >>>> >>>> The most simple code: >>>> nv[ids] += n >>>> does not work, I know about this. For 1D arrays I use >>>> numpy.bincount(...) function. But this function does not work for 2D arrays. >>>> >>>> So, my question. What is the best way calculate accumulation sum for 2D >>>> arrays using indirect indexes? 
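A minimal sketch of the kind of reduceat-based group sum described above,
assuming integer labels: sort once so that equal labels become contiguous,
then reduce over each run. The helper name group_sum is only illustrative;
the sample data are taken from the printed output quoted above:

import numpy as np

def group_sum(labels, data):
    # sort rows so that equal labels are contiguous
    order = np.argsort(labels, kind='mergesort')
    slabels = labels[order]
    sdata = data[order]
    # start index of each run of equal labels -- the "fence posts" for reduceat
    starts = np.concatenate(([0], np.nonzero(np.diff(slabels))[0] + 1))
    counts = np.diff(np.concatenate((starts, [len(slabels)])))
    sums = np.add.reduceat(sdata, starts, axis=0)
    return slabels[starts], counts, sums

labels = np.array([20, 20, 10, 0, 20, 10, 20])
data = np.array([[1, 2, 3], [1, 2, 4], [3, 3, 1], [5, 0, 0],
                 [1, 9, 0], [2, 3, 4], [9, 9, 1]])
uniq, counts, sums = group_sum(labels, data)
print(uniq)    # [ 0 10 20]
print(counts)  # [1 2 4]
print(sums)    # [[ 5  0  0]
               #  [ 5  6  5]
               #  [12 22  8]]

As noted above, this reorders (copies) the data once; the reduction itself
relies on the contiguous fence posts that reduceat expects.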
>>>> >>>> Sincerely, >>>> Alexander >>>> >>>> _______________________________________________ >>>> SciPy-User mailing list >>>> SciPy-User at scipy.org >>>> http://mail.scipy.org/mailman/listinfo/scipy-user >>>> >>> >>> _______________________________________________ >>> SciPy-User mailing list >>> SciPy-User at scipy.org >>> http://mail.scipy.org/mailman/listinfo/scipy-user >>> >> >> >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user >> > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From travis at continuum.io Thu Feb 2 14:11:55 2012 From: travis at continuum.io (Travis Oliphant) Date: Thu, 2 Feb 2012 13:11:55 -0600 Subject: [SciPy-User] Accumulation sum using indirect indexes In-Reply-To: References: Message-ID: On Feb 2, 2012, at 1:01 PM, josef.pktd at gmail.com wrote: > On Thu, Feb 2, 2012 at 1:16 PM, Warren Weckesser > wrote: >> >> >> On Wed, Feb 1, 2012 at 10:34 AM, Alexander Kalinin >> wrote: >>> >>> Yes, but for large data sets loops is quite slow. I have tried Pandas >>> groupby.sum() and it works faster. >>> >> >> >> Pandas is probably the correct tool to use for this, but it will be nice >> when numpy has a native "group-by" capability. >> >> For what its worth (had to scratch the itch, so to speak), the attached >> script provides a "pure numpy" implementation without a python loop. The >> output of the script is >> >> In [53]: run pseudo_group_by.py >> Label Data >> 20 [1 2 3] >> 20 [1 2 4] >> 10 [3 3 1] >> 0 [5 0 0] >> 20 [1 9 0] >> 10 [2 3 4] >> 20 [9 9 1] >> >> Label Num. Sum >> 0 1 [5 0 0] >> 10 2 [5 6 5] >> 20 4 [12 22 8] >> >> >> A drawback of the method is that it will make a reordered copy of the data. >> I haven't compared the performance to pandas. > > nice use of reduceat, I found it recently in an example but haven't used it yet. > It looks convenient if labels are presorted and numeric. Reduceat is pretty convenient, but it's limited right now because you have to have contiguous fence-posts for your reductions. There is a NEP with the group-by nep to make a reduce that takes in arbitrary index-ranges for reductions. -Travis > > Josef > >> >> Warren >> >> >>> >>> >>> 2012/2/1 Fr?d?ric Bastien >>>> >>>> It will be slow, but you can make a python loop. >>>> >>>> Fred >>>> >>>> On Jan 31, 2012 3:34 PM, "Alexander Kalinin" >>>> wrote: >>>>> >>>>> Hello! >>>>> >>>>> I use SciPy in computer graphics applications. My task is to calculate >>>>> vertex normals by averaging faces normals. In other words I want to >>>>> accumulate vectors with the same ids. For example, >>>>> >>>>> ids = numpy.array([0, 1, 1, 2]) >>>>> n = numpy.array([ [0.1, 0.1, 0.1], [0.1, 0.1, 0.1], [0.1, 0.1, 0.1], >>>>> [0.1, 0.1 0.1] ]) >>>>> >>>>> I need result: >>>>> nv = ([ [0.1, 0.1, 0.1], [0.2, 0.2, 0.2], [0.1, 0.1, 0.1]]) >>>>> >>>>> The most simple code: >>>>> nv[ids] += n >>>>> does not work, I know about this. For 1D arrays I use >>>>> numpy.bincount(...) function. But this function does not work for 2D arrays. >>>>> >>>>> So, my question. What is the best way calculate accumulation sum for 2D >>>>> arrays using indirect indexes? 
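To make the "contiguous fence-posts" remark above concrete: np.add.reduceat sums contiguous runs of an array, given the index where each run starts, so the labels have to be sorted first. A small self-contained sketch, reusing the labels and the first data column from the example output quoted above (the variable names are illustrative, not taken from the scrubbed script):

import numpy as np

labels = np.array([20, 20, 10, 0, 20, 10, 20])
data = np.array([1, 1, 3, 5, 1, 2, 9])

order = np.argsort(labels, kind='mergesort')   # stable sort keeps ties in order
sorted_labels = labels[order]
sorted_data = data[order]

# "fence-posts": the positions where a new label begins in the sorted array
starts = np.flatnonzero(np.concatenate(([True],
                        sorted_labels[1:] != sorted_labels[:-1])))

keys = sorted_labels[starts]                  # -> [ 0 10 20]
sums = np.add.reduceat(sorted_data, starts)   # -> [ 5  5 12]

np.add.reduceat also accepts an axis argument, so the same call can reduce whole rows or columns of a 2-D array at once, which is how the [5 0 0], [5 6 5], [12 22 8] group sums shown above are obtained.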
>>>>> >>>>> Sincerely, >>>>> Alexander >>>>> >>>>> _______________________________________________ >>>>> SciPy-User mailing list >>>>> SciPy-User at scipy.org >>>>> http://mail.scipy.org/mailman/listinfo/scipy-user >>>>> >>>> >>>> _______________________________________________ >>>> SciPy-User mailing list >>>> SciPy-User at scipy.org >>>> http://mail.scipy.org/mailman/listinfo/scipy-user >>>> >>> >>> >>> _______________________________________________ >>> SciPy-User mailing list >>> SciPy-User at scipy.org >>> http://mail.scipy.org/mailman/listinfo/scipy-user >>> >> >> >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user >> > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user From josef.pktd at gmail.com Thu Feb 2 14:29:59 2012 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Thu, 2 Feb 2012 14:29:59 -0500 Subject: [SciPy-User] Accumulation sum using indirect indexes In-Reply-To: References: Message-ID: On Thu, Feb 2, 2012 at 2:11 PM, Travis Oliphant wrote: > > On Feb 2, 2012, at 1:01 PM, josef.pktd at gmail.com wrote: > >> On Thu, Feb 2, 2012 at 1:16 PM, Warren Weckesser >> wrote: >>> >>> >>> On Wed, Feb 1, 2012 at 10:34 AM, Alexander Kalinin >>> wrote: >>>> >>>> Yes, but for large data sets loops is quite slow. I have tried Pandas >>>> groupby.sum() and it works faster. >>>> >>> >>> >>> Pandas is probably the correct tool to use for this, but it will be nice >>> when numpy has a native "group-by" capability. >>> >>> For what its worth (had to scratch the itch, so to speak), the attached >>> script provides a "pure numpy" implementation without a python loop. ?The >>> output of the script is >>> >>> In [53]: run pseudo_group_by.py >>> Label ? Data >>> ?20 ? ?[1 2 3] >>> ?20 ? ?[1 2 4] >>> ?10 ? ?[3 3 1] >>> ? 0 ? ?[5 0 0] >>> ?20 ? ?[1 9 0] >>> ?10 ? ?[2 3 4] >>> ?20 ? ?[9 9 1] >>> >>> Label ?Num. ? Sum >>> ? 0 ? ? 1 ? [5 0 0] >>> ?10 ? ? 2 ? [5 6 5] >>> ?20 ? ? 4 ? [12 22 ?8] >>> >>> >>> A drawback of the method is that it will make a reordered copy of the data. >>> I haven't compared the performance to pandas. >> >> nice use of reduceat, I found it recently in an example but haven't used it yet. >> It looks convenient if labels are presorted and numeric. > > Reduceat is pretty convenient, but it's limited right now because you have to have contiguous fence-posts for your reductions. ? There is a NEP with the group-by nep to make a reduce that takes in arbitrary index-ranges for reductions. I have been looking forward for the group-by for a long time, but I would also be happy with a bincount that takes a 2d or nd weights matrix. Josef > > -Travis > > >> >> Josef >> >>> >>> Warren >>> >>> >>>> >>>> >>>> 2012/2/1 Fr?d?ric Bastien >>>>> >>>>> It will be slow, but you can make a python loop. >>>>> >>>>> Fred >>>>> >>>>> On Jan 31, 2012 3:34 PM, "Alexander Kalinin" >>>>> wrote: >>>>>> >>>>>> Hello! >>>>>> >>>>>> I use SciPy in computer graphics applications. My task is to calculate >>>>>> vertex normals by averaging faces normals. In other words I want to >>>>>> accumulate vectors with the same ids. 
For example, >>>>>> >>>>>> ids = numpy.array([0, 1, 1, 2]) >>>>>> n = numpy.array([ [0.1, 0.1, 0.1], [0.1, 0.1, 0.1], [0.1, 0.1, 0.1], >>>>>> [0.1, 0.1 0.1] ]) >>>>>> >>>>>> I need result: >>>>>> nv = ([ [0.1, 0.1, 0.1], [0.2, 0.2, 0.2], [0.1, 0.1, 0.1]]) >>>>>> >>>>>> The most simple code: >>>>>> nv[ids] += n >>>>>> does not work, I know about this. For 1D arrays I use >>>>>> numpy.bincount(...) function. But this function does not work for 2D arrays. >>>>>> >>>>>> So, my question. What is the best way calculate accumulation sum for 2D >>>>>> arrays using indirect indexes? >>>>>> >>>>>> Sincerely, >>>>>> Alexander >>>>>> >>>>>> _______________________________________________ >>>>>> SciPy-User mailing list >>>>>> SciPy-User at scipy.org >>>>>> http://mail.scipy.org/mailman/listinfo/scipy-user >>>>>> >>>>> >>>>> _______________________________________________ >>>>> SciPy-User mailing list >>>>> SciPy-User at scipy.org >>>>> http://mail.scipy.org/mailman/listinfo/scipy-user >>>>> >>>> >>>> >>>> _______________________________________________ >>>> SciPy-User mailing list >>>> SciPy-User at scipy.org >>>> http://mail.scipy.org/mailman/listinfo/scipy-user >>>> >>> >>> >>> _______________________________________________ >>> SciPy-User mailing list >>> SciPy-User at scipy.org >>> http://mail.scipy.org/mailman/listinfo/scipy-user >>> >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user From denis.laxalde at mcgill.ca Wed Feb 1 11:34:20 2012 From: denis.laxalde at mcgill.ca (Denis Laxalde) Date: Wed, 1 Feb 2012 11:34:20 -0500 Subject: [SciPy-User] NumPy and SciPy test failures In-Reply-To: References: <20120201103754.63be48ec@mcgill.ca> Message-ID: <20120201113420.62d6d3d4@mcgill.ca> Mads M. Hansen wrote: > I checked out the v.0.10.0 tag from the Git repository. But this file (quoting the first error in scipy's test from your original message): > > File "/usr/lib64/python3.2/site-packages/scipy/optimize/tests/test_anneal.py" is not in 0.10.0. See Maybe your local repository was not clean when you built the package? -- Denis From warren.weckesser at enthought.com Thu Feb 2 21:46:42 2012 From: warren.weckesser at enthought.com (Warren Weckesser) Date: Thu, 2 Feb 2012 20:46:42 -0600 Subject: [SciPy-User] Indexing sparse matrices with step In-Reply-To: <20120202165243.GA1808@bob.blip.be> References: <20120202165243.GA1808@bob.blip.be> Message-ID: On Thu, Feb 2, 2012 at 10:52 AM, Pascal Lamblin wrote: > Hi everybody, > > I've noticed that if I have a scipy.sparse matrix (csr or csc), and I > try to index into it with slices, the "step" component of my slice seems > to be silently ignored. > > Is it an expected behaviour? I would have expected an error saying only > a step of None (or not providing a step at all) is supported. > > Here is a small test case: > > import numpy, scipy.sparse > > sm = scipy.sparse.csc_matrix([[1, 0, 0], [0, 0, 0], [0, 0, 0], [0, 1, 0]]) > > # True, expected > numpy.all(sm[:1,:].toarray() == sm.toarray()[:1,:]) > > # False, expected > numpy.all(sm.toarray()[:1,:] == sm.toarray()[:1:-1,:]) > > # True, unexpected > numpy.all(sm[:1:-1,:].toarray() == sm[:1,:].toarray()) > > # False, unexpected > numpy.all(sm[:1:-1,:].toarray() == sm.toarray()[:1:-1,:]) > > Looks like a bug. 
I've created a ticket: http://projects.scipy.org/scipy/ticket/1592 Thanks for reporting the problem. Warren -------------- next part -------------- An HTML attachment was scrubbed... URL: From lafont.fabien at gmail.com Fri Feb 3 05:48:50 2012 From: lafont.fabien at gmail.com (Fabien Lafont) Date: Fri, 3 Feb 2012 11:48:50 +0100 Subject: [SciPy-User] [scipy-user] How to apply a condition on some specific values of an array Message-ID: I think my title is not very clear, but I don't know how to formulate it... I'm just starting to use numpy and I have still Python's reflexes so I want to know how can I do the following code using Numpy "style". for i in range(0,len(array)+1): if 10 References: Message-ID: On 3 February 2012 12:48, Fabien Lafont wrote: > I'm just starting to use numpy and I have still Python's reflexes so I > want to know how can I do the following code using Numpy "style". > > for i in range(0,len(array)+1): > ? ? ? if 10 ? ? ? ? ? ?new_array = array[i]*1000 > > In other words is it possible to "scan" the values of an array and > apply a "modification" to it if the condition is true Yes - you can use fancy indexing (see http://docs.scipy.org/doc/numpy/user/basics.indexing.html) In[1]: import numpy as np In[2]: arr = np.arange(10) In[3]: arr Out[3]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9]) In[4]: arr[(2 < arr) & (arr < 9)] *= 1000 In[5]: arr Out[5]: array([ 0, 1, 2, 3000, 4000, 5000, 6000, 7000, 8000, 9]) Cheers, Scott From madsmh at gmail.com Fri Feb 3 07:13:28 2012 From: madsmh at gmail.com (Mads M. Hansen) Date: Fri, 3 Feb 2012 13:13:28 +0100 Subject: [SciPy-User] NumPy and SciPy test failures In-Reply-To: <20120201113420.62d6d3d4@mcgill.ca> References: <20120201103754.63be48ec@mcgill.ca> <20120201113420.62d6d3d4@mcgill.ca> Message-ID: 2012/2/1 Denis Laxalde : > Mads M. Hansen wrote: >> I checked out the v.0.10.0 tag from the Git repository. > > But this file (quoting the first error in scipy's test from your > original message): >> > ? File "/usr/lib64/python3.2/site-packages/scipy/optimize/tests/test_anneal.py" > > is not in 0.10.0. See > > > Maybe your local repository was not clean when you built the package? Hi Dennis - that was indeed the case - I forgot to run git clean -xdf in the repo. .. Mads From lafont.fabien at gmail.com Fri Feb 3 08:19:39 2012 From: lafont.fabien at gmail.com (Fabien Lafont) Date: Fri, 3 Feb 2012 14:19:39 +0100 Subject: [SciPy-User] [scipy-user] How to apply a condition on some specific values of an array In-Reply-To: References: Message-ID: thx Scott! 2012/2/3 Scott Sinclair : > On 3 February 2012 12:48, Fabien Lafont wrote: >> I'm just starting to use numpy and I have still Python's reflexes so I >> want to know how can I do the following code using Numpy "style". >> >> for i in range(0,len(array)+1): >> ? ? ? if 10> ? ? ? ? ? ?new_array = array[i]*1000 >> >> In other words is it possible to "scan" the values of an array and >> apply a "modification" to it if the condition is true > > Yes - you can use fancy indexing (see > http://docs.scipy.org/doc/numpy/user/basics.indexing.html) > > In[1]: import numpy as np > > In[2]: arr = np.arange(10) > In[3]: arr > Out[3]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9]) > > In[4]: arr[(2 < arr) & (arr < 9)] *= 1000 > In[5]: arr > Out[5]: array([ ? 0, ? ?1, ? ?2, 3000, 4000, 5000, 6000, 7000, 8000, ? 
?9]) > > Cheers, > Scott > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user From dg.gmane at thesamovar.net Sat Feb 4 00:03:04 2012 From: dg.gmane at thesamovar.net (Dan Goodman) Date: Sat, 04 Feb 2012 06:03:04 +0100 Subject: [SciPy-User] Accumulation sum using indirect indexes In-Reply-To: References: Message-ID: For the project I'm working on we have quite a specific case of this to handle, where we have (1) generally have few repeats of the ids, (2) arbitrary operations to be applied, not just addition. I've just been working on an optimised numpy-only solution to this and it might be of interest to others. It works particularly well with few repeats, but I think it's no slower than other solutions if there are many repeats, at least until it gets to be mostly repeats at which points doing a simple loop is faster. For the case of just addition (the case below), a method using sorting and reduceat is probably quicker (I didn't do a comparison), but I thought it might be useful for many people to have an efficient solution for the general case. And if anyone knows a better one, I'd be very interested! It's still far from close to ideal, for the typical case it's about 10-20x slower than doing it with C++ (I used weave to test it), but also about 10-20x faster than doing it with a loop. I've attached the code (function apply_batch, the others are for comparison). If anyone's interested I can comment on the code, but it's basically the trick used by unique(), sorting the indices and comparing adjacent ones. Dan On 31/01/2012 21:34, Alexander Kalinin wrote: > Hello! > > I use SciPy in computer graphics applications. My task is to calculate > vertex normals by averaging faces normals. In other words I want to > accumulate vectors with the same ids. For example, > > ids = numpy.array([0, 1, 1, 2]) > n = numpy.array([ [0.1, 0.1, 0.1], [0.1, 0.1, 0.1], [0.1, 0.1, 0.1], > [0.1, 0.1 0.1] ]) > > I need result: > nv = ([ [0.1, 0.1, 0.1], [0.2, 0.2, 0.2], [0.1, 0.1, 0.1]]) > > The most simple code: > nv[ids] += n > does not work, I know about this. For 1D arrays I use > numpy.bincount(...) function. But this function does not work for 2D arrays. > > So, my question. What is the best way calculate accumulation sum for 2D > arrays using indirect indexes? > > Sincerely, > Alexander > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: batch_apply.py URL: From D.Richards at mmu.ac.uk Sat Feb 4 05:48:22 2012 From: D.Richards at mmu.ac.uk (Dan Richards) Date: Sat, 4 Feb 2012 10:48:22 +0000 Subject: [SciPy-User] scipy.spatial.Delaunay.convex_hull problelm Message-ID: <000101cce32a$83dd1ee0$8b975ca0$@Richards@mmu.ac.uk> Hi All, I have been using scipy to find the Delaunay tetrahedron of a set of points in three-dimensions. However, now I wish to only generate the external faces of the tetrahedron. I assume this can be done using scipy.spatial.Delaunay.convex_hull? 
For my three-dimensional tetrahedron I am using this: import scipy from scipy import spatial Points = ([x1,y1,z1], [x2,y2,z2]...[xn,yn,zn]) Del = scipy.spatial.Delaunay(Points) faces = [] v = x.vertices for i in xrange(x.nsimplex): faces.extend([ (v[i,0],v[i,1],v[i,2]), (v[i,1],v[i,3],v[i,2]), (v[i,0],v[i,3],v[i,1]), (v[i,0],v[i,2],v[i,3]),]) for i in faces: MakeLines(i[0],i[1],i[2]) This allows me to create a three-dimensional tetragedron. I had thought to find the 3D convex hull could simply change either: "v = x.verticies" into "v=x.convex_hull" ; or "Del = scipy.spatial.Delaunay (Points)" into "Del = scipy.spatial.Delaunay.convex_hull(Points)".However, neither of these have worked as planned? If anyone is able to give me some advice or simply point me in the right direction that would be much appreciated. Thanks, Dan "Before acting on this email or opening any attachments you should read the Manchester Metropolitan University email disclaimer available on its website http://www.mmu.ac.uk/emaildisclaimer " -------------- next part -------------- An HTML attachment was scrubbed... URL: From pav at iki.fi Sat Feb 4 08:13:52 2012 From: pav at iki.fi (Pauli Virtanen) Date: Sat, 04 Feb 2012 14:13:52 +0100 Subject: [SciPy-User] scipy.spatial.Delaunay.convex_hull problelm In-Reply-To: <36924.2006949664$1328352535@news.gmane.org> References: <36924.2006949664$1328352535@news.gmane.org> Message-ID: 04.02.2012 11:48, Dan Richards kirjoitti: [clip] > This allows me to create a three-dimensional tetragedron. I had thought > to find the 3D convex hull could simply change either: ?v = x.verticies? > into ?v=x.*convex_hull*? ; or ?Del = scipy.spatial.Delaunay (Points)? > into ?Del = scipy.spatial.Delaunay.*convex_hull*(Points)?.However, > neither of these have worked as planned? Elements of the convex hull are triangles, not tetrahedra, so you need to change the code also below. for i1, i2, i3 in Del.convex_hull: faces.extend([(v[i1,0], ............), .... (.... v[i3,3]),]) From pav at iki.fi Sat Feb 4 10:53:05 2012 From: pav at iki.fi (Pauli Virtanen) Date: Sat, 04 Feb 2012 16:53:05 +0100 Subject: [SciPy-User] scipy.spatial.Delaunay.convex_hull problelm In-Reply-To: References: <36924.2006949664$1328352535@news.gmane.org> Message-ID: 04.02.2012 14:13, Pauli Virtanen kirjoitti: > 04.02.2012 11:48, Dan Richards kirjoitti: > [clip] >> This allows me to create a three-dimensional tetragedron. I had thought >> to find the 3D convex hull could simply change either: ?v = x.verticies? >> into ?v=x.*convex_hull*? ; or ?Del = scipy.spatial.Delaunay (Points)? >> into ?Del = scipy.spatial.Delaunay.*convex_hull*(Points)?.However, >> neither of these have worked as planned? > > Elements of the convex hull are triangles, not tetrahedra, so you need > to change the code also below. > > for i1, i2, i3 in Del.convex_hull: > faces.extend([(v[i1,0], ............), .... (.... 
v[i3,3]),]) Like so: import numpy as np from scipy.spatial import Delaunay points = np.random.randn(300, 3) tri = Delaunay(points) # -- Make a list of faces, [(p1, p2, p3), ...]; pj = (xj, yj, zj) faces = [] for ia, ib, ic in tri.convex_hull: faces.append(points[[ia, ib, ic]]) import matplotlib.pyplot as plt from mpl_toolkits.mplot3d import Axes3D from mpl_toolkits.mplot3d.art3d import Poly3DCollection fig = plt.figure() ax = fig.gca(projection='3d') items = Poly3DCollection(faces, facecolors=[(0, 0, 0, 0.1)]) ax.add_collection(items) ax.scatter(points[:,0], points[:,1], points[:,2], 'o') plt.show() From alec.kalinin at gmail.com Sat Feb 4 14:23:57 2012 From: alec.kalinin at gmail.com (Alexander Kalinin) Date: Sat, 4 Feb 2012 22:23:57 +0300 Subject: [SciPy-User] Accumulation sum using indirect indexes In-Reply-To: References: Message-ID: I have checked the performance of the "pure numpy" solution with pandas solution on my task. The "pure numpy" solution is about two times slower. The data shape: (1062, 6348) Pandas "group by sum" time: 0.16588 seconds Pure numpy "group by sum" time: 0.38979 seconds But it is interesting, that the main bottleneck in numpy solution is the data copying. I have divided solution on three blocks: # block (a): s = np.argsort(labels) keys, inv = np.unique(labels, return_inverse = True) i = inv[s] groups_at = np.where(i != np.concatenate(([-1], i[:-1])))[0] # block (b): ordered_data = data[:, s] # block (c): group_sums = np.add.reduceat(ordered_data, groups_at, axis = 1) The timing for the blocks is: block (a): 0.00138 seconds block (b): 0.29285 seconds block (c): 0.08868 seconds The sorting and reduce_at procedures are very fast. But only one line: "ordered_data = data[:, s]" takes the most time. For me it is a bit strange. The reduceat() procedure where summation is executed is about 3 time faster than the only data copying. Alexander On Thu, Feb 2, 2012 at 10:16 PM, Warren Weckesser < warren.weckesser at enthought.com> wrote: > > > On Wed, Feb 1, 2012 at 10:34 AM, Alexander Kalinin > wrote: > >> Yes, but for large data sets loops is quite slow. I have tried Pandas >> groupby.sum() and it works faster. >> >> > > Pandas is probably the correct tool to use for this, but it will be nice > when numpy has a native "group-by" capability. > > For what its worth (had to scratch the itch, so to speak), the attached > script provides a "pure numpy" implementation without a python loop. The > output of the script is > > In [53]: run pseudo_group_by.py > Label Data > 20 [1 2 3] > 20 [1 2 4] > 10 [3 3 1] > 0 [5 0 0] > 20 [1 9 0] > 10 [2 3 4] > 20 [9 9 1] > > Label Num. Sum > 0 1 [5 0 0] > 10 2 [5 6 5] > 20 4 [12 22 8] > > > A drawback of the method is that it will make a reordered copy of the > data. I haven't compared the performance to pandas. > > Warren > > > >> >> 2012/2/1 Fr?d?ric Bastien >> >>> It will be slow, but you can make a python loop. >>> >>> Fred >>> On Jan 31, 2012 3:34 PM, "Alexander Kalinin" >>> wrote: >>> >>>> Hello! >>>> >>>> I use SciPy in computer graphics applications. My task is to calculate >>>> vertex normals by averaging faces normals. In other words I want to >>>> accumulate vectors with the same ids. For example, >>>> >>>> ids = numpy.array([0, 1, 1, 2]) >>>> n = numpy.array([ [0.1, 0.1, 0.1], [0.1, 0.1, 0.1], [0.1, 0.1, 0.1], >>>> [0.1, 0.1 0.1] ]) >>>> >>>> I need result: >>>> nv = ([ [0.1, 0.1, 0.1], [0.2, 0.2, 0.2], [0.1, 0.1, 0.1]]) >>>> >>>> The most simple code: >>>> nv[ids] += n >>>> does not work, I know about this. 
For 1D arrays I use >>>> numpy.bincount(...) function. But this function does not work for 2D arrays. >>>> >>>> So, my question. What is the best way calculate accumulation sum for 2D >>>> arrays using indirect indexes? >>>> >>>> Sincerely, >>>> Alexander >>>> >>>> _______________________________________________ >>>> SciPy-User mailing list >>>> SciPy-User at scipy.org >>>> http://mail.scipy.org/mailman/listinfo/scipy-user >>>> >>>> >>> _______________________________________________ >>> SciPy-User mailing list >>> SciPy-User at scipy.org >>> http://mail.scipy.org/mailman/listinfo/scipy-user >>> >>> >> >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user >> >> > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From wesmckinn at gmail.com Sat Feb 4 14:27:18 2012 From: wesmckinn at gmail.com (Wes McKinney) Date: Sat, 4 Feb 2012 14:27:18 -0500 Subject: [SciPy-User] Accumulation sum using indirect indexes In-Reply-To: References: Message-ID: On Sat, Feb 4, 2012 at 2:23 PM, Alexander Kalinin wrote: > I have checked the performance of the "pure numpy" solution with pandas > solution on my task. The "pure numpy" solution is about two times slower. > > The data shape: > ??? (1062, 6348) > Pandas "group by sum" time: > ??? 0.16588 seconds > Pure numpy "group by sum" time: > ??? 0.38979 seconds > > But it is interesting, that the main bottleneck in numpy solution is the > data copying. I have divided solution on three blocks: > > # block (a): > ? ? s = np.argsort(labels) > > keys, inv = np.unique(labels, return_inverse = True) > > i = inv[s] > > groups_at = np.where(i != np.concatenate(([-1], i[:-1])))[0] > > > # block (b): > ??? ordered_data = data[:, s] > > # block (c): > ??? group_sums = np.add.reduceat(ordered_data, groups_at, axis = 1) > > The timing for the blocks is: > block (a): > ??? 0.00138 seconds > > block (b): > ??? 0.29285 seconds > > block (c): > ??? 0.08868 seconds > > The sorting and reduce_at procedures are very fast. But only one line: > "ordered_data = data[:, s]" takes the most time. > > For me it is a bit strange. The reduceat() procedure where summation is > executed is about 3 time faster than the only data copying. > > Alexander > > > On Thu, Feb 2, 2012 at 10:16 PM, Warren Weckesser > wrote: >> >> >> >> On Wed, Feb 1, 2012 at 10:34 AM, Alexander Kalinin >> wrote: >>> >>> Yes, but for large data sets loops is quite slow. I have tried Pandas >>> groupby.sum() and it works faster. >>> >> >> >> Pandas is probably the correct tool to use for this, but it will be nice >> when numpy has a native "group-by" capability. >> >> For what its worth (had to scratch the itch, so to speak), the attached >> script provides a "pure numpy" implementation without a python loop.? The >> output of the script is >> >> In [53]: run pseudo_group_by.py >> Label?? Data >> ?20??? [1 2 3] >> ?20??? [1 2 4] >> ?10??? [3 3 1] >> ? 0??? [5 0 0] >> ?20??? [1 9 0] >> ?10??? [2 3 4] >> ?20??? [9 9 1] >> >> Label? Num.?? Sum >> ? 0???? 1?? [5 0 0] >> ?10???? 2?? [5 6 5] >> ?20???? 4?? [12 22? 8] >> >> >> A drawback of the method is that it will make a reordered copy of the >> data.? I haven't compared the performance to pandas. 
>> >> Warren >> >> >>> >>> >>> 2012/2/1 Fr?d?ric Bastien >>>> >>>> It will be slow, but you can make a python loop. >>>> >>>> Fred >>>> >>>> On Jan 31, 2012 3:34 PM, "Alexander Kalinin" >>>> wrote: >>>>> >>>>> Hello! >>>>> >>>>> I use SciPy in computer graphics applications. My task is to calculate >>>>> vertex normals by averaging faces normals. In other words I want to >>>>> accumulate vectors with the same ids. For example, >>>>> >>>>> ids = numpy.array([0, 1, 1, 2]) >>>>> n = numpy.array([ [0.1, 0.1, 0.1], [0.1, 0.1, 0.1], [0.1, 0.1, 0.1], >>>>> [0.1, 0.1 0.1] ]) >>>>> >>>>> I need result: >>>>> nv = ([ [0.1, 0.1, 0.1], [0.2, 0.2, 0.2], [0.1, 0.1, 0.1]]) >>>>> >>>>> The most simple code: >>>>> nv[ids] += n >>>>> does not work, I know about this. For 1D arrays I use >>>>> numpy.bincount(...) function. But this function does not work for 2D arrays. >>>>> >>>>> So, my question. What is the best way calculate accumulation sum for 2D >>>>> arrays using indirect indexes? >>>>> >>>>> Sincerely, >>>>> Alexander >>>>> >>>>> _______________________________________________ >>>>> SciPy-User mailing list >>>>> SciPy-User at scipy.org >>>>> http://mail.scipy.org/mailman/listinfo/scipy-user >>>>> >>>> >>>> _______________________________________________ >>>> SciPy-User mailing list >>>> SciPy-User at scipy.org >>>> http://mail.scipy.org/mailman/listinfo/scipy-user >>>> >>> >>> >>> _______________________________________________ >>> SciPy-User mailing list >>> SciPy-User at scipy.org >>> http://mail.scipy.org/mailman/listinfo/scipy-user >>> >> >> >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user >> > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > I should point out that pandas is not very optimized for a large number of columns like this. I just created a github issue about it: https://github.com/wesm/pandas/issues/745 I'll get to it eventually - Wes From vanforeest at gmail.com Sat Feb 4 18:12:27 2012 From: vanforeest at gmail.com (nicky van foreest) Date: Sun, 5 Feb 2012 00:12:27 +0100 Subject: [SciPy-User] scipy.stats.poisson, strange output? Message-ID: Hi, I used two types of poisson, and obtained different results. Specifically: In [1]: from scipy.stats import poisson In [2]: import numpy as np In [3]: grid = np.arange(20) In [4]: rv = poisson(10) In [5]: print rv.pmf(grid) [ 4.53999298e-05 4.53999298e-04 2.26999649e-03 7.56665496e-03 1.89166374e-02 3.78332748e-02 6.30554580e-02 9.00792257e-02 1.12599032e-01 1.25110036e-01 1.25110036e-01 1.13736396e-01 9.47803301e-02 7.29079462e-02 5.20771044e-02 3.47180696e-02 2.16987935e-02 1.27639962e-02 7.09110899e-03 3.73216263e-03] In [6]: print poisson.pmf(10., grid) [ nan 1.01377712e-07 3.81898506e-05 8.10151179e-04 5.29247668e-03 1.81327887e-02 4.13030934e-02 7.09832687e-02 9.92615338e-02 1.18580076e-01 1.25110036e-01 1.19378060e-01 1.04837256e-01 8.58701508e-02 6.62818432e-02 4.86107508e-02 3.40976998e-02 2.29995844e-02 1.49851586e-02 9.46624674e-03] In [7]: So, in line [5], rv.pmf(grid)[0] is a number, while in [6], poisson.pmf(10,grid)[0] is nan. Am I doing something wrong, or is this an unintentional inconsistency? Nicky From vanforeest at gmail.com Sat Feb 4 18:32:23 2012 From: vanforeest at gmail.com (nicky van foreest) Date: Sun, 5 Feb 2012 00:32:23 +0100 Subject: [SciPy-User] scipy.stats.poisson, strange output? 
In-Reply-To: References: Message-ID: Hi, I have found my mistake. I should have called poisson.pmf(grid, 10.) rather than poisson.pmf(10, grid). Sorry for the spam. Nicky From josef.pktd at gmail.com Sat Feb 4 18:33:08 2012 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Sat, 4 Feb 2012 18:33:08 -0500 Subject: [SciPy-User] scipy.stats.poisson, strange output? In-Reply-To: References: Message-ID: On Sat, Feb 4, 2012 at 6:12 PM, nicky van foreest wrote: > Hi, > > I used two types of poisson, and obtained different results. Specifically: > > In [1]: from scipy.stats import poisson > > In [2]: import numpy as np > > In [3]: grid = np.arange(20) > > In [4]: rv = poisson(10) > > In [5]: print rv.pmf(grid) > [ ?4.53999298e-05 ? 4.53999298e-04 ? 2.26999649e-03 ? 7.56665496e-03 > ? 1.89166374e-02 ? 3.78332748e-02 ? 6.30554580e-02 ? 9.00792257e-02 > ? 1.12599032e-01 ? 1.25110036e-01 ? 1.25110036e-01 ? 1.13736396e-01 > ? 9.47803301e-02 ? 7.29079462e-02 ? 5.20771044e-02 ? 3.47180696e-02 > ? 2.16987935e-02 ? 1.27639962e-02 ? 7.09110899e-03 ? 3.73216263e-03] > > In [6]: print poisson.pmf(10., grid) > [ ? ? ? ? ? ? nan ? 1.01377712e-07 ? 3.81898506e-05 ? 8.10151179e-04 > ? 5.29247668e-03 ? 1.81327887e-02 ? 4.13030934e-02 ? 7.09832687e-02 > ? 9.92615338e-02 ? 1.18580076e-01 ? 1.25110036e-01 ? 1.19378060e-01 > ? 1.04837256e-01 ? 8.58701508e-02 ? 6.62818432e-02 ? 4.86107508e-02 > ? 3.40976998e-02 ? 2.29995844e-02 ? 1.49851586e-02 ? 9.46624674e-03] wrong sequence of arguments, the shape (mean) argument should be second and first the values at which pmf is evaluated, i.e. >>> stats.poisson.pmf(grid, 10) array([ 0.0000453999297625, 0.0004539992976248, 0.0022699964881242, 0.0075666549604141, 0.0189166374010354, 0.0378332748020708, 0.0630554580034512, 0.090079225719216 , 0.1125990321490201, 0.1251100357211337, 0.1251100357211337, 0.1137363961101213, 0.094780330091768 , 0.0729079462244373, 0.0520771044460262, 0.0347180696306844, 0.0216987935191777, 0.0127639961877516, 0.0070911089931953, 0.0037321626279975]) in the first case it's a frozen distribution >>> stats.poisson(10).pmf(grid) array([ 0.0000453999297625, 0.0004539992976248, 0.0022699964881242, 0.0075666549604141, 0.0189166374010354, 0.0378332748020708, 0.0630554580034512, 0.090079225719216 , 0.1125990321490201, 0.1251100357211337, 0.1251100357211337, 0.1137363961101213, 0.094780330091768 , 0.0729079462244373, 0.0520771044460262, 0.0347180696306844, 0.0216987935191777, 0.0127639961877516, 0.0070911089931953, 0.0037321626279975]) Josef > > In [7]: > > > So, in line [5], rv.pmf(grid)[0] is a number, while in [6], > poisson.pmf(10,grid)[0] is nan. Am I doing something wrong, or is this > an unintentional inconsistency? > > Nicky > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user From josef.pktd at gmail.com Sat Feb 4 19:01:39 2012 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Sat, 4 Feb 2012 19:01:39 -0500 Subject: [SciPy-User] Accumulation sum using indirect indexes In-Reply-To: References: Message-ID: On Sat, Feb 4, 2012 at 2:27 PM, Wes McKinney wrote: > On Sat, Feb 4, 2012 at 2:23 PM, Alexander Kalinin > wrote: >> I have checked the performance of the "pure numpy" solution with pandas >> solution on my task. The "pure numpy" solution is about two times slower. >> >> The data shape: >> ??? (1062, 6348) >> Pandas "group by sum" time: >> ??? 0.16588 seconds >> Pure numpy "group by sum" time: >> ??? 
0.38979 seconds >> >> But it is interesting, that the main bottleneck in numpy solution is the >> data copying. I have divided solution on three blocks: >> >> # block (a): >> ? ? s = np.argsort(labels) >> >> keys, inv = np.unique(labels, return_inverse = True) >> >> i = inv[s] >> >> groups_at = np.where(i != np.concatenate(([-1], i[:-1])))[0] >> >> >> # block (b): >> ??? ordered_data = data[:, s] can you try with numpy.take? Keith and Wes were showing that take is much faster than advanced indexing. Josef >> >> # block (c): >> ??? group_sums = np.add.reduceat(ordered_data, groups_at, axis = 1) >> >> The timing for the blocks is: >> block (a): >> ??? 0.00138 seconds >> >> block (b): >> ??? 0.29285 seconds >> >> block (c): >> ??? 0.08868 seconds >> >> The sorting and reduce_at procedures are very fast. But only one line: >> "ordered_data = data[:, s]" takes the most time. >> >> For me it is a bit strange. The reduceat() procedure where summation is >> executed is about 3 time faster than the only data copying. >> >> Alexander >> >> >> On Thu, Feb 2, 2012 at 10:16 PM, Warren Weckesser >> wrote: >>> >>> >>> >>> On Wed, Feb 1, 2012 at 10:34 AM, Alexander Kalinin >>> wrote: >>>> >>>> Yes, but for large data sets loops is quite slow. I have tried Pandas >>>> groupby.sum() and it works faster. >>>> >>> >>> >>> Pandas is probably the correct tool to use for this, but it will be nice >>> when numpy has a native "group-by" capability. >>> >>> For what its worth (had to scratch the itch, so to speak), the attached >>> script provides a "pure numpy" implementation without a python loop.? The >>> output of the script is >>> >>> In [53]: run pseudo_group_by.py >>> Label?? Data >>> ?20??? [1 2 3] >>> ?20??? [1 2 4] >>> ?10??? [3 3 1] >>> ? 0??? [5 0 0] >>> ?20??? [1 9 0] >>> ?10??? [2 3 4] >>> ?20??? [9 9 1] >>> >>> Label? Num.?? Sum >>> ? 0???? 1?? [5 0 0] >>> ?10???? 2?? [5 6 5] >>> ?20???? 4?? [12 22? 8] >>> >>> >>> A drawback of the method is that it will make a reordered copy of the >>> data.? I haven't compared the performance to pandas. >>> >>> Warren >>> >>> >>>> >>>> >>>> 2012/2/1 Fr?d?ric Bastien >>>>> >>>>> It will be slow, but you can make a python loop. >>>>> >>>>> Fred >>>>> >>>>> On Jan 31, 2012 3:34 PM, "Alexander Kalinin" >>>>> wrote: >>>>>> >>>>>> Hello! >>>>>> >>>>>> I use SciPy in computer graphics applications. My task is to calculate >>>>>> vertex normals by averaging faces normals. In other words I want to >>>>>> accumulate vectors with the same ids. For example, >>>>>> >>>>>> ids = numpy.array([0, 1, 1, 2]) >>>>>> n = numpy.array([ [0.1, 0.1, 0.1], [0.1, 0.1, 0.1], [0.1, 0.1, 0.1], >>>>>> [0.1, 0.1 0.1] ]) >>>>>> >>>>>> I need result: >>>>>> nv = ([ [0.1, 0.1, 0.1], [0.2, 0.2, 0.2], [0.1, 0.1, 0.1]]) >>>>>> >>>>>> The most simple code: >>>>>> nv[ids] += n >>>>>> does not work, I know about this. For 1D arrays I use >>>>>> numpy.bincount(...) function. But this function does not work for 2D arrays. >>>>>> >>>>>> So, my question. What is the best way calculate accumulation sum for 2D >>>>>> arrays using indirect indexes? 
>>>>>> >>>>>> Sincerely, >>>>>> Alexander >>>>>> >>>>>> _______________________________________________ >>>>>> SciPy-User mailing list >>>>>> SciPy-User at scipy.org >>>>>> http://mail.scipy.org/mailman/listinfo/scipy-user >>>>>> >>>>> >>>>> _______________________________________________ >>>>> SciPy-User mailing list >>>>> SciPy-User at scipy.org >>>>> http://mail.scipy.org/mailman/listinfo/scipy-user >>>>> >>>> >>>> >>>> _______________________________________________ >>>> SciPy-User mailing list >>>> SciPy-User at scipy.org >>>> http://mail.scipy.org/mailman/listinfo/scipy-user >>>> >>> >>> >>> _______________________________________________ >>> SciPy-User mailing list >>> SciPy-User at scipy.org >>> http://mail.scipy.org/mailman/listinfo/scipy-user >>> >> >> >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user >> > > I should point out that pandas is not very optimized for a large > number of columns like this. I just created a github issue about it: > > https://github.com/wesm/pandas/issues/745 > > I'll get to it eventually > > - Wes > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user From warren.weckesser at enthought.com Sat Feb 4 19:28:24 2012 From: warren.weckesser at enthought.com (Warren Weckesser) Date: Sat, 4 Feb 2012 18:28:24 -0600 Subject: [SciPy-User] Accumulation sum using indirect indexes In-Reply-To: References: Message-ID: On Sat, Feb 4, 2012 at 6:01 PM, wrote: > On Sat, Feb 4, 2012 at 2:27 PM, Wes McKinney wrote: > > On Sat, Feb 4, 2012 at 2:23 PM, Alexander Kalinin > > wrote: > >> I have checked the performance of the "pure numpy" solution with pandas > >> solution on my task. The "pure numpy" solution is about two times > slower. > >> > >> The data shape: > >> (1062, 6348) > >> Pandas "group by sum" time: > >> 0.16588 seconds > >> Pure numpy "group by sum" time: > >> 0.38979 seconds > >> > >> But it is interesting, that the main bottleneck in numpy solution is the > >> data copying. I have divided solution on three blocks: > >> > >> # block (a): > >> s = np.argsort(labels) > >> > >> keys, inv = np.unique(labels, return_inverse = True) > >> > >> i = inv[s] > >> > >> groups_at = np.where(i != np.concatenate(([-1], i[:-1])))[0] > >> > >> > >> # block (b): > >> ordered_data = data[:, s] > > can you try with numpy.take? Keith and Wes were showing that take is > much faster than advanced indexing. > Good idea. numpy.take is much faster: In [35]: data.shape Out[35]: (1000000, 3) In [36]: %timeit o = data[s] 10 loops, best of 3: 155 ms per loop In [37]: %timeit o = take(data, s, axis=0) 10 loops, best of 3: 37.1 ms per loop Warren > Josef > > >> > >> # block (c): > >> group_sums = np.add.reduceat(ordered_data, groups_at, axis = 1) > >> > >> The timing for the blocks is: > >> block (a): > >> 0.00138 seconds > >> > >> block (b): > >> 0.29285 seconds > >> > >> block (c): > >> 0.08868 seconds > >> > >> The sorting and reduce_at procedures are very fast. But only one line: > >> "ordered_data = data[:, s]" takes the most time. > >> > >> For me it is a bit strange. The reduceat() procedure where summation is > >> executed is about 3 time faster than the only data copying. 
> >> > >> Alexander > >> > >> > >> On Thu, Feb 2, 2012 at 10:16 PM, Warren Weckesser > >> wrote: > >>> > >>> > >>> > >>> On Wed, Feb 1, 2012 at 10:34 AM, Alexander Kalinin > >>> wrote: > >>>> > >>>> Yes, but for large data sets loops is quite slow. I have tried Pandas > >>>> groupby.sum() and it works faster. > >>>> > >>> > >>> > >>> Pandas is probably the correct tool to use for this, but it will be > nice > >>> when numpy has a native "group-by" capability. > >>> > >>> For what its worth (had to scratch the itch, so to speak), the attached > >>> script provides a "pure numpy" implementation without a python loop. > The > >>> output of the script is > >>> > >>> In [53]: run pseudo_group_by.py > >>> Label Data > >>> 20 [1 2 3] > >>> 20 [1 2 4] > >>> 10 [3 3 1] > >>> 0 [5 0 0] > >>> 20 [1 9 0] > >>> 10 [2 3 4] > >>> 20 [9 9 1] > >>> > >>> Label Num. Sum > >>> 0 1 [5 0 0] > >>> 10 2 [5 6 5] > >>> 20 4 [12 22 8] > >>> > >>> > >>> A drawback of the method is that it will make a reordered copy of the > >>> data. I haven't compared the performance to pandas. > >>> > >>> Warren > >>> > >>> > >>>> > >>>> > >>>> 2012/2/1 Fr?d?ric Bastien > >>>>> > >>>>> It will be slow, but you can make a python loop. > >>>>> > >>>>> Fred > >>>>> > >>>>> On Jan 31, 2012 3:34 PM, "Alexander Kalinin" > > >>>>> wrote: > >>>>>> > >>>>>> Hello! > >>>>>> > >>>>>> I use SciPy in computer graphics applications. My task is to > calculate > >>>>>> vertex normals by averaging faces normals. In other words I want to > >>>>>> accumulate vectors with the same ids. For example, > >>>>>> > >>>>>> ids = numpy.array([0, 1, 1, 2]) > >>>>>> n = numpy.array([ [0.1, 0.1, 0.1], [0.1, 0.1, 0.1], [0.1, 0.1, 0.1], > >>>>>> [0.1, 0.1 0.1] ]) > >>>>>> > >>>>>> I need result: > >>>>>> nv = ([ [0.1, 0.1, 0.1], [0.2, 0.2, 0.2], [0.1, 0.1, 0.1]]) > >>>>>> > >>>>>> The most simple code: > >>>>>> nv[ids] += n > >>>>>> does not work, I know about this. For 1D arrays I use > >>>>>> numpy.bincount(...) function. But this function does not work for > 2D arrays. > >>>>>> > >>>>>> So, my question. What is the best way calculate accumulation sum > for 2D > >>>>>> arrays using indirect indexes? > >>>>>> > >>>>>> Sincerely, > >>>>>> Alexander > >>>>>> > >>>>>> _______________________________________________ > >>>>>> SciPy-User mailing list > >>>>>> SciPy-User at scipy.org > >>>>>> http://mail.scipy.org/mailman/listinfo/scipy-user > >>>>>> > >>>>> > >>>>> _______________________________________________ > >>>>> SciPy-User mailing list > >>>>> SciPy-User at scipy.org > >>>>> http://mail.scipy.org/mailman/listinfo/scipy-user > >>>>> > >>>> > >>>> > >>>> _______________________________________________ > >>>> SciPy-User mailing list > >>>> SciPy-User at scipy.org > >>>> http://mail.scipy.org/mailman/listinfo/scipy-user > >>>> > >>> > >>> > >>> _______________________________________________ > >>> SciPy-User mailing list > >>> SciPy-User at scipy.org > >>> http://mail.scipy.org/mailman/listinfo/scipy-user > >>> > >> > >> > >> _______________________________________________ > >> SciPy-User mailing list > >> SciPy-User at scipy.org > >> http://mail.scipy.org/mailman/listinfo/scipy-user > >> > > > > I should point out that pandas is not very optimized for a large > > number of columns like this. 
I just created a github issue about it: > > > > https://github.com/wesm/pandas/issues/745 > > > > I'll get to it eventually > > > > - Wes > > _______________________________________________ > > SciPy-User mailing list > > SciPy-User at scipy.org > > http://mail.scipy.org/mailman/listinfo/scipy-user > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > -------------- next part -------------- An HTML attachment was scrubbed... URL: From alec.kalinin at gmail.com Sun Feb 5 02:17:12 2012 From: alec.kalinin at gmail.com (Alexander Kalinin) Date: Sun, 5 Feb 2012 10:17:12 +0300 Subject: [SciPy-User] Accumulation sum using indirect indexes In-Reply-To: References: Message-ID: Yes, the numpy.take() is much faster than "fancy" indexing and now "pure numpy" solution is two time faster than pandas. Below are timing results: The data shape: (1062, 6348) Pandas solution: 0.16610 seconds "Pure numpy" solution: 0.08907 seconds Timing of the "pure numpy" by blocks: block (a) (sorting and obtaining groups): 0.00134 seconds block (b) (copy data to the ordered_data): 0.05517 seconds block (c) (reduceat): 0.02698 Alexander. On Sun, Feb 5, 2012 at 4:01 AM, wrote: > On Sat, Feb 4, 2012 at 2:27 PM, Wes McKinney wrote: > > On Sat, Feb 4, 2012 at 2:23 PM, Alexander Kalinin > > wrote: > >> I have checked the performance of the "pure numpy" solution with pandas > >> solution on my task. The "pure numpy" solution is about two times > slower. > >> > >> The data shape: > >> (1062, 6348) > >> Pandas "group by sum" time: > >> 0.16588 seconds > >> Pure numpy "group by sum" time: > >> 0.38979 seconds > >> > >> But it is interesting, that the main bottleneck in numpy solution is the > >> data copying. I have divided solution on three blocks: > >> > >> # block (a): > >> s = np.argsort(labels) > >> > >> keys, inv = np.unique(labels, return_inverse = True) > >> > >> i = inv[s] > >> > >> groups_at = np.where(i != np.concatenate(([-1], i[:-1])))[0] > >> > >> > >> # block (b): > >> ordered_data = data[:, s] > > can you try with numpy.take? Keith and Wes were showing that take is > much faster than advanced indexing. > > Josef > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From yosefmel at post.tau.ac.il Sun Feb 5 03:59:21 2012 From: yosefmel at post.tau.ac.il (Yosef Meller) Date: Sun, 05 Feb 2012 10:59:21 +0200 Subject: [SciPy-User] asarray_chkfinite In-Reply-To: <9394EDC3-D458-4697-86AC-75115C14AA62@toadhill.net> References: <9394EDC3-D458-4697-86AC-75115C14AA62@toadhill.net> Message-ID: <11969561.AY5P6bKH8p@yosef-pc> On Wednesday, 1 ?February 2012 11:16:37 glen at toadhill.net wrote: > Hi all, > > I'm trying to optimize some code that entails a very large number of sparse > matrix-vector and vctor-vector multiplies. Upon running the profiler I see > that about 25% of my program's cumulative time is spent running > asarray_chkfinite. I do not call this routine directly. Can anyone tell me > what might be calling it and whether there is anything obvious I can do > about it? In addition to what Ralph said, I recommend using pycallgraph to see who calls what. Yosef. -------------- next part -------------- An HTML attachment was scrubbed... 
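Besides pycallgraph, the standard-library profiler can also report who calls a given function: pstats.Stats.print_callers accepts a name filter. A rough sketch, where run_solver() is only a placeholder for whatever code shows the asarray_chkfinite overhead:

import cProfile
import pstats

def run_solver():
    # placeholder for the sparse matrix-vector code being profiled
    pass

cProfile.run('run_solver()', 'profile.out')
stats = pstats.Stats('profile.out')
stats.print_callers('asarray_chkfinite')   # lists the callers of asarray_chkfinite

That narrows the search down to the wrapper doing the finiteness check without building a full call graph.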
URL: From D.Richards at mmu.ac.uk Sun Feb 5 09:36:28 2012 From: D.Richards at mmu.ac.uk (Daniel Richards) Date: Sun, 5 Feb 2012 14:36:28 +0000 Subject: [SciPy-User] scipy.spatial.Delaunay.convex_hull problelm In-Reply-To: References: <36924.2006949664$1328352535@news.gmane.org> , Message-ID: <69978DA9452B194B9467CF8E34D69723541A4E50@EXMB3.ad.mmu.ac.uk> Hi Pauli, This is great, it works! Thanks for the help. Dan ________________________________________ From: scipy-user-bounces at scipy.org [scipy-user-bounces at scipy.org] on behalf of Pauli Virtanen [pav at iki.fi] Sent: 04 February 2012 15:53 To: scipy-user at scipy.org Subject: Re: [SciPy-User] scipy.spatial.Delaunay.convex_hull problelm 04.02.2012 14:13, Pauli Virtanen kirjoitti: > 04.02.2012 11:48, Dan Richards kirjoitti: > [clip] >> This allows me to create a three-dimensional tetragedron. I had thought >> to find the 3D convex hull could simply change either: ?v = x.verticies? >> into ?v=x.*convex_hull*? ; or ?Del = scipy.spatial.Delaunay (Points)? >> into ?Del = scipy.spatial.Delaunay.*convex_hull*(Points)?.However, >> neither of these have worked as planned? > > Elements of the convex hull are triangles, not tetrahedra, so you need > to change the code also below. > > for i1, i2, i3 in Del.convex_hull: > faces.extend([(v[i1,0], ............), .... (.... v[i3,3]),]) Like so: import numpy as np from scipy.spatial import Delaunay points = np.random.randn(300, 3) tri = Delaunay(points) # -- Make a list of faces, [(p1, p2, p3), ...]; pj = (xj, yj, zj) faces = [] for ia, ib, ic in tri.convex_hull: faces.append(points[[ia, ib, ic]]) import matplotlib.pyplot as plt from mpl_toolkits.mplot3d import Axes3D from mpl_toolkits.mplot3d.art3d import Poly3DCollection fig = plt.figure() ax = fig.gca(projection='3d') items = Poly3DCollection(faces, facecolors=[(0, 0, 0, 0.1)]) ax.add_collection(items) ax.scatter(points[:,0], points[:,1], points[:,2], 'o') plt.show() _______________________________________________ SciPy-User mailing list SciPy-User at scipy.org http://mail.scipy.org/mailman/listinfo/scipy-user "Before acting on this email or opening any attachments you should read the Manchester Metropolitan University email disclaimer available on its website http://www.mmu.ac.uk/emaildisclaimer " From conradlee at gmail.com Sun Feb 5 10:05:39 2012 From: conradlee at gmail.com (Conrad Lee) Date: Sun, 5 Feb 2012 15:05:39 +0000 Subject: [SciPy-User] Can I copy a sparse matrix into an existing dense numpy matrix? Message-ID: Say I have a huge numpy matrix *A* taking up tens of gigabytes. It takes a non-negligible amount of time to allocate this memory. Let's say I also have a collection of scipy sparse matrices with the same dimensions as the numpy matrix. Sometimes I want to convert one of these sparse matrices into a dense matrix to perform some vectorized operations that can't be performed on sparse matrices. Can I load one of these sparse matrices into *A* rather than re-allocate space each time I want to convert a sparse matrix into a dense matrix? The .toarray() and .todense() methods which are available on scipy sparse matrices do not seem to take an optional dense array argument, but maybe there is some other way to do this. (I've also started a stackoverflow version of this question here .) Thanks, Conrad lee -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From warren.weckesser at enthought.com Sun Feb 5 10:21:01 2012 From: warren.weckesser at enthought.com (Warren Weckesser) Date: Sun, 5 Feb 2012 09:21:01 -0600 Subject: [SciPy-User] Can I copy a sparse matrix into an existing dense numpy matrix? In-Reply-To: References: Message-ID: On Sun, Feb 5, 2012 at 9:05 AM, Conrad Lee wrote: > Say I have a huge numpy matrix *A* taking up tens of gigabytes. It takes > a non-negligible amount of time to allocate this memory. > > Let's say I also have a collection of scipy sparse matrices with the same > dimensions as the numpy matrix. Sometimes I want to convert one of these > sparse matrices into a dense matrix to perform some vectorized operations > that can't be performed on sparse matrices. > > Can I load one of these sparse matrices into *A* rather than re-allocate > space each time I want to convert a sparse matrix into a dense matrix? The > .toarray() and .todense() methods which are available on scipy sparse > matrices do not seem to take an optional dense array argument, but maybe > there is some other way to do this. > > (I've also started a stackoverflow version of this question here > .) > > Thanks, > > Conrad lee > > If your sparse matrix is in coo format, you can use fancy indexing to assign the values to the existing array. For example: In [29]: import scipy.sparse as sp In [30]: import numpy as np In [31]: a = sp.coo_matrix([[0,0,1,0],[0,0,0,0],[2,0,3,0],[0,4,0,0]]) In [32]: d = np.zeros((4,4), dtype=np.int32) In [33]: a.todense() Out[33]: matrix([[0, 0, 1, 0], [0, 0, 0, 0], [2, 0, 3, 0], [0, 4, 0, 0]]) In [34]: d[a.row, a.col] = a.data In [35]: d Out[35]: array([[0, 0, 1, 0], [0, 0, 0, 0], [2, 0, 3, 0], [0, 4, 0, 0]]) Warren -------------- next part -------------- An HTML attachment was scrubbed... URL: From vanforeest at gmail.com Sun Feb 5 15:41:39 2012 From: vanforeest at gmail.com (nicky van foreest) Date: Sun, 5 Feb 2012 21:41:39 +0100 Subject: [SciPy-User] scipy.stats.poisson, strange output? In-Reply-To: References: Message-ID: Hi Josef, > wrong sequence of arguments, the shape (mean) argument should be > second and first the values at which pmf is evaluated, i.e. Thanks. I discovered it just after your reply. I must admit that I find it more natural to first specify the distribution's parameters, such as mu for the Poisson distribution, and then specify the points at which to evaluate the pmf (or cdf, etc.) This explains the error. > >>>> stats.poisson.pmf(grid, 10) > array([ 0.0000453999297625, ?0.0004539992976248, ?0.0022699964881242, > ? ? ? ?0.0075666549604141, ?0.0189166374010354, ?0.0378332748020708, > ? ? ? ?0.0630554580034512, ?0.090079225719216 , ?0.1125990321490201, > ? ? ? ?0.1251100357211337, ?0.1251100357211337, ?0.1137363961101213, > ? ? ? ?0.094780330091768 , ?0.0729079462244373, ?0.0520771044460262, > ? ? ? ?0.0347180696306844, ?0.0216987935191777, ?0.0127639961877516, > ? ? ? ?0.0070911089931953, ?0.0037321626279975]) > > in the first case it's a frozen distribution > >>>> stats.poisson(10).pmf(grid) > array([ 0.0000453999297625, ?0.0004539992976248, ?0.0022699964881242, > ? ? ? ?0.0075666549604141, ?0.0189166374010354, ?0.0378332748020708, > ? ? ? ?0.0630554580034512, ?0.090079225719216 , ?0.1125990321490201, > ? ? ? ?0.1251100357211337, ?0.1251100357211337, ?0.1137363961101213, > ? ? ? ?0.094780330091768 , ?0.0729079462244373, ?0.0520771044460262, > ? ? ? ?0.0347180696306844, ?0.0216987935191777, ?0.0127639961877516, > ? ? ? 
?0.0070911089931953, ?0.0037321626279975]) > > Josef > >> >> In [7]: >> >> >> So, in line [5], rv.pmf(grid)[0] is a number, while in [6], >> poisson.pmf(10,grid)[0] is nan. Am I doing something wrong, or is this >> an unintentional inconsistency? >> >> Nicky >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user From josef.pktd at gmail.com Sun Feb 5 16:35:27 2012 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Sun, 5 Feb 2012 16:35:27 -0500 Subject: [SciPy-User] scipy.stats.poisson, strange output? In-Reply-To: References: Message-ID: On Sun, Feb 5, 2012 at 3:41 PM, nicky van foreest wrote: > Hi Josef, > >> wrong sequence of arguments, the shape (mean) argument should be >> second and first the values at which pmf is evaluated, i.e. > > Thanks. I discovered it just after your reply. I must admit that I > find it more natural to first specify the distribution's parameters, > such as mu for the Poisson distribution, and then specify the points > at which to evaluate the pmf (or cdf, etc.) This explains the error. I saw thatt you replied at the same time. I find the current version easier to follow, reading "given the parameters" pmf(x, theta) = Prob(x | theta) cdf(x, theta) = F(x | theta) pdf(x, theta) = f(x | theta) (and we could even pretend we are Bayesians :) Cheers, Josef > >> >>>>> stats.poisson.pmf(grid, 10) >> array([ 0.0000453999297625, ?0.0004539992976248, ?0.0022699964881242, >> ? ? ? ?0.0075666549604141, ?0.0189166374010354, ?0.0378332748020708, >> ? ? ? ?0.0630554580034512, ?0.090079225719216 , ?0.1125990321490201, >> ? ? ? ?0.1251100357211337, ?0.1251100357211337, ?0.1137363961101213, >> ? ? ? ?0.094780330091768 , ?0.0729079462244373, ?0.0520771044460262, >> ? ? ? ?0.0347180696306844, ?0.0216987935191777, ?0.0127639961877516, >> ? ? ? ?0.0070911089931953, ?0.0037321626279975]) >> >> in the first case it's a frozen distribution >> >>>>> stats.poisson(10).pmf(grid) >> array([ 0.0000453999297625, ?0.0004539992976248, ?0.0022699964881242, >> ? ? ? ?0.0075666549604141, ?0.0189166374010354, ?0.0378332748020708, >> ? ? ? ?0.0630554580034512, ?0.090079225719216 , ?0.1125990321490201, >> ? ? ? ?0.1251100357211337, ?0.1251100357211337, ?0.1137363961101213, >> ? ? ? ?0.094780330091768 , ?0.0729079462244373, ?0.0520771044460262, >> ? ? ? ?0.0347180696306844, ?0.0216987935191777, ?0.0127639961877516, >> ? ? ? ?0.0070911089931953, ?0.0037321626279975]) >> >> Josef >> >>> >>> In [7]: >>> >>> >>> So, in line [5], rv.pmf(grid)[0] is a number, while in [6], >>> poisson.pmf(10,grid)[0] is nan. Am I doing something wrong, or is this >>> an unintentional inconsistency? 
>>> >>> Nicky >>> _______________________________________________ >>> SciPy-User mailing list >>> SciPy-User at scipy.org >>> http://mail.scipy.org/mailman/listinfo/scipy-user >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user From vanforeest at gmail.com Sun Feb 5 17:00:16 2012 From: vanforeest at gmail.com (nicky van foreest) Date: Sun, 5 Feb 2012 23:00:16 +0100 Subject: [SciPy-User] scipy.stats.poisson, strange output? In-Reply-To: References: Message-ID: > I find the current version easier to follow, reading "given the parameters" This is a good hint to memorize the proper sequence. > (and we could even pretend we are Bayesians :) I am some sort of a Bayesian :-) Nicky > > Cheers, > > Josef > >> >>> >>>>>> stats.poisson.pmf(grid, 10) >>> array([ 0.0000453999297625, ?0.0004539992976248, ?0.0022699964881242, >>> ? ? ? ?0.0075666549604141, ?0.0189166374010354, ?0.0378332748020708, >>> ? ? ? ?0.0630554580034512, ?0.090079225719216 , ?0.1125990321490201, >>> ? ? ? ?0.1251100357211337, ?0.1251100357211337, ?0.1137363961101213, >>> ? ? ? ?0.094780330091768 , ?0.0729079462244373, ?0.0520771044460262, >>> ? ? ? ?0.0347180696306844, ?0.0216987935191777, ?0.0127639961877516, >>> ? ? ? ?0.0070911089931953, ?0.0037321626279975]) >>> >>> in the first case it's a frozen distribution >>> >>>>>> stats.poisson(10).pmf(grid) >>> array([ 0.0000453999297625, ?0.0004539992976248, ?0.0022699964881242, >>> ? ? ? ?0.0075666549604141, ?0.0189166374010354, ?0.0378332748020708, >>> ? ? ? ?0.0630554580034512, ?0.090079225719216 , ?0.1125990321490201, >>> ? ? ? ?0.1251100357211337, ?0.1251100357211337, ?0.1137363961101213, >>> ? ? ? ?0.094780330091768 , ?0.0729079462244373, ?0.0520771044460262, >>> ? ? ? ?0.0347180696306844, ?0.0216987935191777, ?0.0127639961877516, >>> ? ? ? ?0.0070911089931953, ?0.0037321626279975]) >>> >>> Josef >>> >>>> >>>> In [7]: >>>> >>>> >>>> So, in line [5], rv.pmf(grid)[0] is a number, while in [6], >>>> poisson.pmf(10,grid)[0] is nan. Am I doing something wrong, or is this >>>> an unintentional inconsistency? >>>> >>>> Nicky >>>> _______________________________________________ >>>> SciPy-User mailing list >>>> SciPy-User at scipy.org >>>> http://mail.scipy.org/mailman/listinfo/scipy-user >>> _______________________________________________ >>> SciPy-User mailing list >>> SciPy-User at scipy.org >>> http://mail.scipy.org/mailman/listinfo/scipy-user >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user From lafont.fabien at gmail.com Mon Feb 6 04:41:20 2012 From: lafont.fabien at gmail.com (Fabien Lafont) Date: Mon, 6 Feb 2012 10:41:20 +0100 Subject: [SciPy-User] [scipy-user] How to apply a condition on some specific values of an array In-Reply-To: References: Message-ID: And is it possible to apply a specific operation with a condition (if). I have to apply different operations on the same array depending on the value of the array element. 
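As an aside on this kind of either/or update: numpy.where builds the result in one expression by choosing, element by element, between two precomputed alternatives. A small sketch, using the threshold of 500 from the concrete example that follows below:

import numpy as np

arr = np.array([100, 250, 501, 700])

# entries below 500 are multiplied by 100, the rest by 1000;
# arr itself is left unchanged
new_arr = np.where(arr < 500, arr * 100, arr * 1000)
# new_arr -> [ 10000  25000 501000 700000]

Note that both alternatives are evaluated for every element before the selection, which is fine here but can matter when one branch is expensive or undefined for some entries.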
For example: I have an array like that: [100 250 501 700] and I want to multiply by 100 the value if this value is smalest thant 500 and multiply by 1000 if the value is bigest. Fabien 2012/2/3 Fabien Lafont : > thx Scott! > > 2012/2/3 Scott Sinclair : >> On 3 February 2012 12:48, Fabien Lafont wrote: >>> I'm just starting to use numpy and I have still Python's reflexes so I >>> want to know how can I do the following code using Numpy "style". >>> >>> for i in range(0,len(array)+1): >>> ? ? ? if 10>> ? ? ? ? ? ?new_array = array[i]*1000 >>> >>> In other words is it possible to "scan" the values of an array and >>> apply a "modification" to it if the condition is true >> >> Yes - you can use fancy indexing (see >> http://docs.scipy.org/doc/numpy/user/basics.indexing.html) >> >> In[1]: import numpy as np >> >> In[2]: arr = np.arange(10) >> In[3]: arr >> Out[3]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9]) >> >> In[4]: arr[(2 < arr) & (arr < 9)] *= 1000 >> In[5]: arr >> Out[5]: array([ ? 0, ? ?1, ? ?2, 3000, 4000, 5000, 6000, 7000, 8000, ? ?9]) >> >> Cheers, >> Scott >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user From scott.sinclair.za at gmail.com Mon Feb 6 05:13:32 2012 From: scott.sinclair.za at gmail.com (Scott Sinclair) Date: Mon, 6 Feb 2012 12:13:32 +0200 Subject: [SciPy-User] [scipy-user] How to apply a condition on some specific values of an array In-Reply-To: References: Message-ID: On 6 February 2012 11:41, Fabien Lafont wrote: > And is it possible to apply a specific operation with a condition > (if). I have to apply different operations on the same array depending > on the value of the array element. > > For example: I have an array like that: > > [100 ?250 ?501 ?700] and I want to multiply by 100 the value if this > value is smalest thant 500 and multiply by 1000 if the value is > bigest. Here's one way that should be easy to follow. You'll have to make a copy of your array (as shown here), or generate two index arrays before modifying your original array. In [1]: arr = np.array([100, 250, 501, 700]) In [2]: # make a copy to avoid aliasing ...: new_arr = np.array(arr) In [3]: new_arr[arr < 500] *= 100 In [4]: new_arr[arr > 500] *= 1000 In [5]: arr Out[5]: array([100, 250, 501, 700]) In [6]: new_arr Out[6]: array([ 10000, 25000, 501000, 700000]) Cheers, Scott From lafont.fabien at gmail.com Mon Feb 6 05:27:20 2012 From: lafont.fabien at gmail.com (Fabien Lafont) Date: Mon, 6 Feb 2012 11:27:20 +0100 Subject: [SciPy-User] [scipy-user] How to apply a condition on some specific values of an array In-Reply-To: References: Message-ID: Great, it seems really easy! Is it possible to append the values to new_arr because I have to do it with many "arr" so new_arr will be erase each time if I do: new_arr[arr < 500] *= 100 new_arr[arr2 < 500] *= 100 new_arr[arr3 < 500] *= 100 I've tried np.append but it doesn't work... Nevertheless thanks again! Fab 2012/2/6 Scott Sinclair : > On 6 February 2012 11:41, Fabien Lafont wrote: >> And is it possible to apply a specific operation with a condition >> (if). I have to apply different operations on the same array depending >> on the value of the array element. >> >> For example: I have an array like that: >> >> [100 ?250 ?501 ?700] and I want to multiply by 100 the value if this >> value is smalest thant 500 and multiply by 1000 if the value is >> bigest. > > Here's one way that should be easy to follow. 
You'll have to make a > copy of your array (as shown here), or generate two index arrays > before modifying your original array. > > In [1]: arr = np.array([100, 250, 501, 700]) > > In [2]: # make a copy to avoid aliasing > ? ...: new_arr = np.array(arr) > > In [3]: new_arr[arr < 500] *= 100 > > In [4]: new_arr[arr > 500] *= 1000 > > In [5]: arr > Out[5]: array([100, 250, 501, 700]) > > In [6]: new_arr > Out[6]: array([ 10000, ?25000, 501000, 700000]) > > Cheers, > Scott > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user From lafont.fabien at gmail.com Mon Feb 6 06:31:25 2012 From: lafont.fabien at gmail.com (Fabien Lafont) Date: Mon, 6 Feb 2012 12:31:25 +0100 Subject: [SciPy-User] [scipy-user] How to apply a condition on some specific values of an array In-Reply-To: References: Message-ID: Sorry Scott, I managed to append values. I was using it like classical-list append function! Thx again, Fab 2012/2/6 Fabien Lafont : > Great, it seems really easy! > > Is it possible to append the values to new_arr because I have to do it > with many "arr" so new_arr will be erase each time if I do: > > new_arr[arr < 500] *= 100 > new_arr[arr2 < 500] *= 100 > new_arr[arr3 < 500] *= 100 > > I've tried np.append but it doesn't work... > > Nevertheless thanks again! > > Fab > > 2012/2/6 Scott Sinclair : >> On 6 February 2012 11:41, Fabien Lafont wrote: >>> And is it possible to apply a specific operation with a condition >>> (if). I have to apply different operations on the same array depending >>> on the value of the array element. >>> >>> For example: I have an array like that: >>> >>> [100 ?250 ?501 ?700] and I want to multiply by 100 the value if this >>> value is smalest thant 500 and multiply by 1000 if the value is >>> bigest. >> >> Here's one way that should be easy to follow. You'll have to make a >> copy of your array (as shown here), or generate two index arrays >> before modifying your original array. >> >> In [1]: arr = np.array([100, 250, 501, 700]) >> >> In [2]: # make a copy to avoid aliasing >> ? ...: new_arr = np.array(arr) >> >> In [3]: new_arr[arr < 500] *= 100 >> >> In [4]: new_arr[arr > 500] *= 1000 >> >> In [5]: arr >> Out[5]: array([100, 250, 501, 700]) >> >> In [6]: new_arr >> Out[6]: array([ 10000, ?25000, 501000, 700000]) >> >> Cheers, >> Scott >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user From guyer at nist.gov Mon Feb 6 09:12:19 2012 From: guyer at nist.gov (Jonathan Guyer) Date: Mon, 6 Feb 2012 09:12:19 -0500 Subject: [SciPy-User] Can I copy a sparse matrix into an existing dense numpy matrix? In-Reply-To: References: Message-ID: <2B682631-3D88-4E0A-BA15-760B39F1F1A2@nist.gov> On Feb 5, 2012, at 10:21 AM, Warren Weckesser wrote: > > > On Sun, Feb 5, 2012 at 9:05 AM, Conrad Lee wrote: > Say I have a huge numpy matrix A taking up tens of gigabytes. It takes a non-negligible amount of time to allocate this memory. > > Let's say I also have a collection of scipy sparse matrices with the same dimensions as the numpy matrix. Sometimes I want to convert one of these sparse matrices into a dense matrix to perform some vectorized operations that can't be performed on sparse matrices. > > Can I load one of these sparse matrices into A rather than re-allocate space each time I want to convert a sparse matrix into a dense matrix? 
The .toarray() and .todense() methods which are available on scipy sparse matrices do not seem to take an optional dense array argument, but maybe there is some other way to do this. > > (I've also started a stackoverflow version of this question here.) > > Thanks, > > Conrad lee > > > > If your sparse matrix is in coo format, you can use fancy indexing to assign the values to the existing array. Although, unless your sparsity pattern doesn't change (which it may not), you'll need to zero the entire dense array before reassigning, which will also take "a non-negligible amount of time". From conradlee at gmail.com Mon Feb 6 09:56:22 2012 From: conradlee at gmail.com (Conrad Lee) Date: Mon, 6 Feb 2012 14:56:22 +0000 Subject: [SciPy-User] Can I copy a sparse matrix into an existing dense numpy matrix? In-Reply-To: <2B682631-3D88-4E0A-BA15-760B39F1F1A2@nist.gov> References: <2B682631-3D88-4E0A-BA15-760B39F1F1A2@nist.gov> Message-ID: Warren, thanks for the suggestion with the COO matrix. In general I'm storing sparse matrices in the CSR format for quick multiplication, so your approach would mean that I have to convert to a COO matrix every time, but that conversion is pretty quick. Although, unless your sparsity pattern doesn't change (which it may not), > you'll need to zero the entire dense array before reassigning, which will > also take "a non-negligible amount of time". > Zeroing out a matrix seems to happen very quickly, probably because it's a vectorized operation taking advantage of the SIMD instructions on modern processors. As far as I understand it, allocating huge amounts of memory requires slower operations. I did a quick and dirty benchmark, and zeroing takes a small fraction of the time of allocating. > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > -------------- next part -------------- An HTML attachment was scrubbed... URL: From guyer at nist.gov Mon Feb 6 10:11:33 2012 From: guyer at nist.gov (Jonathan Guyer) Date: Mon, 6 Feb 2012 10:11:33 -0500 Subject: [SciPy-User] Can I copy a sparse matrix into an existing dense numpy matrix? In-Reply-To: References: <2B682631-3D88-4E0A-BA15-760B39F1F1A2@nist.gov> Message-ID: On Feb 6, 2012, at 9:56 AM, Conrad Lee wrote: > I did a quick and dirty benchmark, and zeroing takes a small fraction of the time of allocating. Good to know. From warren.weckesser at enthought.com Mon Feb 6 11:21:09 2012 From: warren.weckesser at enthought.com (Warren Weckesser) Date: Mon, 6 Feb 2012 10:21:09 -0600 Subject: [SciPy-User] Can I copy a sparse matrix into an existing dense numpy matrix? In-Reply-To: References: <2B682631-3D88-4E0A-BA15-760B39F1F1A2@nist.gov> Message-ID: On Mon, Feb 6, 2012 at 8:56 AM, Conrad Lee wrote: > Warren, thanks for the suggestion with the COO matrix. In general I'm > storing sparse matrices in the CSR format for quick multiplication, so your > approach would mean that I have to convert to a COO matrix every time, but > that conversion is pretty quick. > Conrad, Here's an example of how you could do the assignment directly with a CSR matrix: import numpy as np from scipy.sparse import csr_matrix # 'c' is a sparse matrix in CSR format. c = csr_matrix([[0,0,1,0,0,0], [0,2,0,3,0,0], [0,0,0,0,0,0], [4,0,0,0,5,0]]) # 'a' is the dense array into which we'll copy the nonzero # elements of 'c' a = np.zeros(c.shape, dtype=c.dtype) # The next line is the key part: it converts c.indptr into # the row indices in the dense array. 
(c.indices already has # the columns.) rows = sum((m*[k] for k, m in enumerate(np.diff(c.indptr))), []) a[rows, c.indices] = c.data print c.todense() print a print np.all(c.todense() == a) This might be more efficient than converting to COO. Warren > Although, unless your sparsity pattern doesn't change (which it may not), >> you'll need to zero the entire dense array before reassigning, which will >> also take "a non-negligible amount of time". >> > > Zeroing out a matrix seems to happen very quickly, probably because it's a > vectorized operation taking advantage of the SIMD instructions on modern > processors. As far as I understand it, allocating huge amounts of memory > requires slower operations. I did a quick and dirty benchmark, and zeroing > takes a small fraction of the time of allocating. > > >> >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user >> > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From vaggi.federico at gmail.com Tue Feb 7 04:18:00 2012 From: vaggi.federico at gmail.com (federico vaggi) Date: Tue, 7 Feb 2012 10:18:00 +0100 Subject: [SciPy-User] From Delaunay edges to spatial points Message-ID: Hello, I am a relative newbie to tessellation, so I might be asking a very naive question. I have an unweighted graph (list of nodes, edge lists) that I would like to plot on the surface of a sphere. Given the edge list, is there a way to come up with a x,y,z position of all the nodes so that they follow a Delaunay tessellation? Most software I've seen starts from positions in space and then tries to obtain the edge list - I'd like to do the inverse, if possible. I am not 100% sure if this is more appropriate for the scipy mailing list or the networkx mailing list, so I think I'll post it in both places as long as that's not frowned upon. Thank you very much, Fede -------------- next part -------------- An HTML attachment was scrubbed... URL: From lafont.fabien at gmail.com Tue Feb 7 05:43:04 2012 From: lafont.fabien at gmail.com (Fabien Lafont) Date: Tue, 7 Feb 2012 11:43:04 +0100 Subject: [SciPy-User] [scipy-user] How to use genfromtext() with np.array? Message-ID: I've saved a np.array in a file using write(). Ihave then a file with my np.array over 8 columns and I can't load it using genfromtext to load at the same time the entire array. It seems genfromtext doesn't "see" the array as a real array but as 8 different columns. Is it possible to load the array easily with genfromtext, or save my array in a different way. It works with a for loop over each indices of the array with + "\n" but it's not very convenient. Thx, Fabien From papuu_k at yahoo.com Tue Feb 7 05:43:34 2012 From: papuu_k at yahoo.com (Pappu Kumar) Date: Tue, 7 Feb 2012 16:13:34 +0530 (IST) Subject: [SciPy-User] Fitting Differential Equations to a Curve Message-ID: <1328611414.88237.YahooMailNeo@web137608.mail.in.yahoo.com> I am trying to fit the differential equation ay' + by''=0 to a curve by varying a and b The following code does not work: from scipy.integrate import odeint from scipy.optimize import curve_fit from numpy import linspace, random, array time = linspace(0.0,10.0,100) def deriv(time,a,b): ??? dy=lambda y,t : array([ y[1], a*y[0]+b*y[1] ]) ??? yinit = array([0.0005,0.2]) # initial values ??? 
Y=odeint(dy,yinit,time) ??? return Y[:,0] z = deriv(time, 2, 0.1) zn = z + 0.1*random.normal(size=len(time)) popt, pcov = curve_fit(deriv, time, zn) print popt # it only outputs the initial values of a, b! -------------- next part -------------- An HTML attachment was scrubbed... URL: From paul.anton.letnes at gmail.com Tue Feb 7 09:38:01 2012 From: paul.anton.letnes at gmail.com (Paul Anton Letnes) Date: Tue, 7 Feb 2012 15:38:01 +0100 Subject: [SciPy-User] [scipy-user] How to use genfromtext() with np.array? In-Reply-To: References: Message-ID: <82CA1654-A13C-44B7-8184-6599E8B91DE6@gmail.com> The easiest way is probably savetxt/loadtxt: In [1]: d = np.linspace(0,1,10) In [2]: np.savetxt('foo', d) In [3]: d2 = np.loadtxt('foo') In [4]: d-d2 Out[4]: array([ 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.]) It should work equally well with 2D arrays. Paul On 7. feb. 2012, at 11:43, Fabien Lafont wrote: > I've saved a np.array in a file using write(). Ihave then a file with > my np.array over 8 columns and I can't load it using genfromtext to > load at the same time the entire array. It seems genfromtext doesn't > "see" the array as a real array but as 8 different columns. Is it > possible to load the array easily with genfromtext, or save my array > in a different way. It works with a for loop over each indices of the > array with + "\n" but it's not very convenient. > > Thx, > > Fabien > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user From lafont.fabien at gmail.com Tue Feb 7 09:39:41 2012 From: lafont.fabien at gmail.com (Fabien Lafont) Date: Tue, 7 Feb 2012 15:39:41 +0100 Subject: [SciPy-User] [scipy-user] How to use genfromtext() with np.array? In-Reply-To: <82CA1654-A13C-44B7-8184-6599E8B91DE6@gmail.com> References: <82CA1654-A13C-44B7-8184-6599E8B91DE6@gmail.com> Message-ID: Thank you very much, it works perfectly! 2012/2/7 Paul Anton Letnes : > The easiest way is probably savetxt/loadtxt: > In [1]: d = np.linspace(0,1,10) > > In [2]: np.savetxt('foo', d) > > In [3]: d2 = np.loadtxt('foo') > > In [4]: d-d2 > Out[4]: array([ 0., ?0., ?0., ?0., ?0., ?0., ?0., ?0., ?0., ?0.]) > > It should work equally well with 2D arrays. > > Paul > > On 7. feb. 2012, at 11:43, Fabien Lafont wrote: > >> I've saved a np.array in a file using write(). Ihave then a file with >> my np.array over 8 columns and I can't load it using genfromtext to >> load at the same time the entire array. It seems genfromtext doesn't >> "see" the array as a real array but as 8 different columns. Is it >> possible to load the array easily with genfromtext, or save my array >> in a different way. It works with a for loop over each indices of the >> array with + "\n" but it's not very convenient. 
>> >> Thx, >> >> Fabien >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user From pav at iki.fi Tue Feb 7 14:04:00 2012 From: pav at iki.fi (Pauli Virtanen) Date: Tue, 07 Feb 2012 20:04:00 +0100 Subject: [SciPy-User] Fitting Differential Equations to a Curve In-Reply-To: <1328611414.88237.YahooMailNeo@web137608.mail.in.yahoo.com> References: <1328611414.88237.YahooMailNeo@web137608.mail.in.yahoo.com> Message-ID: 07.02.2012 11:43, Pappu Kumar kirjoitti: > I am trying to fit the differential equation ay' + by''=0 to a curve by > varying a and b The following code does not work: > > from scipy.integrate import odeint > from scipy.optimize import curve_fit > from numpy import linspace, random, array > > time = linspace(0.0,10.0,100) > def deriv(time,a,b): > dy=lambda y,t : array([ y[1], a*y[0]+b*y[1] ]) > yinit = array([0.0005,0.2]) # initial values > Y=odeint(dy,yinit,time) > return Y[:,0] > > z = deriv(time, 2, 0.1) > zn = z + 0.1*random.normal(size=len(time)) > > popt, pcov = curve_fit(deriv, time, zn) > print popt # it only outputs the initial values of a, b! For me, this prints [ 1.999963 0.10002353] So seems to be working as expected, as the function to fit was made with parameters [2, 0.1] plus some added noise. -- Pauli Virtanen From wesmckinn at gmail.com Tue Feb 7 17:38:43 2012 From: wesmckinn at gmail.com (Wes McKinney) Date: Tue, 7 Feb 2012 17:38:43 -0500 Subject: [SciPy-User] Accumulation sum using indirect indexes In-Reply-To: References: Message-ID: On Sun, Feb 5, 2012 at 2:17 AM, Alexander Kalinin wrote: > Yes, the numpy.take() is much faster than "fancy" indexing and now "pure > numpy" solution is two time faster than pandas. Below are timing results: > > > The data shape: > ? ?? (1062, 6348) > > Pandas solution: > ??? 0.16610 seconds > > "Pure numpy" solution: > ??? 0.08907 seconds > > Timing of the "pure numpy" by blocks: > block (a) (sorting and obtaining groups): > ??? 0.00134 seconds > block (b) (copy data to the ordered_data): > ??? 0.05517 seconds > block (c) (reduceat): > ??? 0.02698 > > Alexander. > > > On Sun, Feb 5, 2012 at 4:01 AM, wrote: >> >> On Sat, Feb 4, 2012 at 2:27 PM, Wes McKinney wrote: >> > On Sat, Feb 4, 2012 at 2:23 PM, Alexander Kalinin >> > wrote: >> >> I have checked the performance of the "pure numpy" solution with pandas >> >> solution on my task. The "pure numpy" solution is about two times >> >> slower. >> >> >> >> The data shape: >> >> ??? (1062, 6348) >> >> Pandas "group by sum" time: >> >> ??? 0.16588 seconds >> >> Pure numpy "group by sum" time: >> >> ??? 0.38979 seconds >> >> >> >> But it is interesting, that the main bottleneck in numpy solution is >> >> the >> >> data copying. I have divided solution on three blocks: >> >> >> >> # block (a): >> >> ? ? s = np.argsort(labels) >> >> >> >> keys, inv = np.unique(labels, return_inverse = True) >> >> >> >> i = inv[s] >> >> >> >> groups_at = np.where(i != np.concatenate(([-1], i[:-1])))[0] >> >> >> >> >> >> # block (b): >> >> ??? ordered_data = data[:, s] >> >> can you try with numpy.take? Keith and Wes were showing that take is >> much faster than advanced indexing. 
>> >> Josef >> > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > FWIW I did some refactoring in pandas today and am getting the following timings now in this use case: In [12]: df = DataFrame(randn(1062, 6348)) In [13]: labels = np.random.randint(0, 100, size=1062) In [14]: timeit df.groupby(labels).sum() 10 loops, best of 3: 38.7 ms per loop comparing with def numpy_groupby(data, labels, axis=0): s = np.argsort(labels) keys, inv = np.unique(labels, return_inverse = True) i = inv.take(s) groups_at = np.where(i != np.concatenate(([-1], i[:-1])))[0] ordered_data = data.take(s, axis=axis) group_sums = np.add.reduceat(ordered_data, groups_at, axis=axis) return group_sums In [15]: timeit numpy_groupby(df.values, labels) 10 loops, best of 3: 95.4 ms per loop according to line_profiler, the runtime is being consumed by the reduceat now In [20]: lprun -f numpy_groupby numpy_groupby(df.values, labels) Timer unit: 1e-06 s File: pandas/core/groupby.py Function: numpy_groupby at line 1511 Total time: 0.108126 s Line # Hits Time Per Hit % Time Line Contents ============================================================== 1511 def numpy_groupby(data, labels): 1512 1 125 125.0 0.1 s = np.argsort(labels) 1513 1 720 720.0 0.7 keys, inv = np.unique(labels, return_inverse = True) 1514 1 13 13.0 0.0 i = inv.take(s) 1515 1 62 62.0 0.1 groups_at = np.where(i != np.concatenate(([-1], i[:-1])))[0] 1516 1 28684 28684.0 26.5 ordered_data = data.take(s, axis=0) 1517 1 78519 78519.0 72.6 group_sums = np.add.reduceat(ordered_data, groups_at, axis=0) 1518 1519 1 3 3.0 0.0 return group_sums The performance of the pure numpy solution will degrade both with the length of the labels vector and the number of unique elements (because there are two O(N log N) steps there). In this case it matters less because there are so many rows / columns to aggregate The performance of the pure NumPy solution is unsurprisingly much better when the aggregation is across contiguous memory vs. strided memory access: In [41]: labels = np.random.randint(0, 100, size=6348) In [42]: timeit numpy_groupby(df.values, labels, axis=1) 10 loops, best of 3: 47.4 ms per loop pandas is slower in this case because it's not giving any consideration to cache locality: In [50]: timeit df.groupby(labels, axis=1).sum() 10 loops, best of 3: 79.9 ms per loop One can only complain so much about 30 lines of Cython code ;) Good enough for the time being - Wes From wesmckinn at gmail.com Tue Feb 7 18:15:04 2012 From: wesmckinn at gmail.com (Wes McKinney) Date: Tue, 7 Feb 2012 18:15:04 -0500 Subject: [SciPy-User] Accumulation sum using indirect indexes In-Reply-To: References: Message-ID: On Tue, Feb 7, 2012 at 5:38 PM, Wes McKinney wrote: > On Sun, Feb 5, 2012 at 2:17 AM, Alexander Kalinin > wrote: >> Yes, the numpy.take() is much faster than "fancy" indexing and now "pure >> numpy" solution is two time faster than pandas. Below are timing results: >> >> >> The data shape: >> ? ?? (1062, 6348) >> >> Pandas solution: >> ??? 0.16610 seconds >> >> "Pure numpy" solution: >> ??? 0.08907 seconds >> >> Timing of the "pure numpy" by blocks: >> block (a) (sorting and obtaining groups): >> ??? 0.00134 seconds >> block (b) (copy data to the ordered_data): >> ??? 0.05517 seconds >> block (c) (reduceat): >> ??? 0.02698 >> >> Alexander. 
>> >> >> On Sun, Feb 5, 2012 at 4:01 AM, wrote: >>> >>> On Sat, Feb 4, 2012 at 2:27 PM, Wes McKinney wrote: >>> > On Sat, Feb 4, 2012 at 2:23 PM, Alexander Kalinin >>> > wrote: >>> >> I have checked the performance of the "pure numpy" solution with pandas >>> >> solution on my task. The "pure numpy" solution is about two times >>> >> slower. >>> >> >>> >> The data shape: >>> >> ??? (1062, 6348) >>> >> Pandas "group by sum" time: >>> >> ??? 0.16588 seconds >>> >> Pure numpy "group by sum" time: >>> >> ??? 0.38979 seconds >>> >> >>> >> But it is interesting, that the main bottleneck in numpy solution is >>> >> the >>> >> data copying. I have divided solution on three blocks: >>> >> >>> >> # block (a): >>> >> ? ? s = np.argsort(labels) >>> >> >>> >> keys, inv = np.unique(labels, return_inverse = True) >>> >> >>> >> i = inv[s] >>> >> >>> >> groups_at = np.where(i != np.concatenate(([-1], i[:-1])))[0] >>> >> >>> >> >>> >> # block (b): >>> >> ??? ordered_data = data[:, s] >>> >>> can you try with numpy.take? Keith and Wes were showing that take is >>> much faster than advanced indexing. >>> >>> Josef >>> >> >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user >> > > FWIW I did some refactoring in pandas today and am getting the > following timings now in this use case: > > In [12]: df = DataFrame(randn(1062, 6348)) > > In [13]: labels = np.random.randint(0, 100, size=1062) > > In [14]: timeit df.groupby(labels).sum() > 10 loops, best of 3: 38.7 ms per loop > > comparing with > > def numpy_groupby(data, labels, axis=0): > ? ?s = np.argsort(labels) > ? ?keys, inv = np.unique(labels, return_inverse = True) > ? ?i = inv.take(s) > ? ?groups_at = np.where(i != np.concatenate(([-1], i[:-1])))[0] > ? ?ordered_data = data.take(s, axis=axis) > ? ?group_sums = np.add.reduceat(ordered_data, groups_at, axis=axis) > > ? ?return group_sums > > In [15]: timeit numpy_groupby(df.values, labels) > 10 loops, best of 3: 95.4 ms per loop > > according to line_profiler, the runtime is being consumed by the reduceat now > > In [20]: lprun -f numpy_groupby numpy_groupby(df.values, labels) > Timer unit: 1e-06 s > > File: pandas/core/groupby.py > Function: numpy_groupby at line 1511 > Total time: 0.108126 s > > Line # ? ? ?Hits ? ? ? ? Time ?Per Hit ? % Time ?Line Contents > ============================================================== > ?1511 ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? def > numpy_groupby(data, labels): > ?1512 ? ? ? ? 1 ? ? ? ? ?125 ? ?125.0 ? ? ?0.1 ? ? ?s = np.argsort(labels) > ?1513 ? ? ? ? 1 ? ? ? ? ?720 ? ?720.0 ? ? ?0.7 ? ? ?keys, inv = > np.unique(labels, return_inverse = True) > ?1514 ? ? ? ? 1 ? ? ? ? ? 13 ? ? 13.0 ? ? ?0.0 ? ? ?i = inv.take(s) > ?1515 ? ? ? ? 1 ? ? ? ? ? 62 ? ? 62.0 ? ? ?0.1 ? ? ?groups_at = > np.where(i != np.concatenate(([-1], i[:-1])))[0] > ?1516 ? ? ? ? 1 ? ? ? ?28684 ?28684.0 ? ? 26.5 ? ? ?ordered_data = > data.take(s, axis=0) > ?1517 ? ? ? ? 1 ? ? ? ?78519 ?78519.0 ? ? 72.6 ? ? ?group_sums = > np.add.reduceat(ordered_data, groups_at, axis=0) > ?1518 > ?1519 ? ? ? ? 1 ? ? ? ? ? ?3 ? ? ?3.0 ? ? ?0.0 ? ? ?return group_sums > > The performance of the pure numpy solution will degrade both with the > length of the labels vector and the number of unique elements (because > there are two O(N log N) steps there). 
In this case it matters less > because there are so many rows / columns to aggregate > > The performance of the pure NumPy solution is unsurprisingly much > better when the aggregation is across contiguous memory vs. strided > memory access: > > > In [41]: labels = np.random.randint(0, 100, size=6348) > > In [42]: timeit numpy_groupby(df.values, labels, axis=1) > 10 loops, best of 3: 47.4 ms per loop > > pandas is slower in this case because it's not giving any > consideration to cache locality: > > In [50]: timeit df.groupby(labels, axis=1).sum() > 10 loops, best of 3: 79.9 ms per loop > > One can only complain so much about 30 lines of Cython code ;) Good > enough for the time being > > - Wes More on this for those interested. These methods start really becoming different when you aggregate very large 1D arrays. Consider a million float64s each with a label chosen from 1000 unique labels. You can see where we start running into problems: In [9]: data = np.random.randn(1000000, 1) In [10]: labels = np.random.randint(0, 1000, size=1000000) In [11]: lprun -f gp.numpy_groupby gp.numpy_groupby(data, labels) Timer unit: 1e-06 s File: pandas/core/groupby.py Function: numpy_groupby at line 1512 Total time: 0.413775 s Line # Hits Time Per Hit % Time Line Contents ============================================================== 1512 def numpy_groupby(data, labels, axis=0): 1513 1 98867 98867.0 23.9 s = np.argsort(labels) 1514 1 286792 286792.0 69.3 keys, inv = np.unique(labels, return_inverse = True) 1515 1 9617 9617.0 2.3 i = inv.take(s) 1516 1 8059 8059.0 1.9 groups_at = np.where(i != np.concatenate(([-1], i[:-1])))[0] 1517 1 9365 9365.0 2.3 ordered_data = data.take(s, axis=axis) 1518 1 1073 1073.0 0.3 group_sums = np.add.reduceat(ordered_data, groups_at, axis=axis) 1519 1520 1 2 2.0 0.0 return group_sums In [12]: timeit gp.numpy_groupby(data, labels) 1 loops, best of 3: 410 ms per loop whereas the hash-table based approach (a la pandas) looks like: In [13]: df = DataFrame(data) In [14]: timeit df.groupby(labels).sum() 10 loops, best of 3: 71.5 ms per loop with In [17]: %prun -s cumulative -l 15 for _ in xrange(10): df.groupby(labels).sum() 3002 function calls in 0.771 seconds Ordered by: cumulative time List reduced from 109 to 15 due to restriction <15> ncalls tottime percall cumtime percall filename:lineno(function) 1 0.025 0.025 0.771 0.771 :1() 10 0.000 0.000 0.744 0.074 groupby.py:327(sum) 10 0.001 0.000 0.744 0.074 groupby.py:940(_cython_agg_general) 10 0.000 0.000 0.692 0.069 groupby.py:384(_group_info) 10 0.000 0.000 0.408 0.041 groupby.py:620(labels) 10 0.000 0.000 0.408 0.041 groupby.py:641(_make_labels) 10 0.015 0.002 0.408 0.041 algorithms.py:72(factorize) 10 0.314 0.031 0.314 0.031 {method 'get_labels' of 'pandas._tseries.Int64HashTable' objects} 10 0.012 0.001 0.284 0.028 groupby.py:1470(_compress_group_index) 10 0.178 0.018 0.178 0.018 {method 'get_labels_groupby' of 'pandas._tseries.Int64HashTable' objects} 90 0.132 0.001 0.132 0.001 {method 'take' of 'numpy.ndarray' objects} 10 0.000 0.000 0.045 0.004 groupby.py:1432(cython_aggregate) 10 0.044 0.004 0.044 0.004 {pandas._tseries.group_add} 30 0.019 0.001 0.019 0.001 {numpy.core.multiarray.putmask} 20 0.016 0.001 0.016 0.001 {method 'astype' of 'numpy.ndarray' objects} I'm working on getting rid of the "compress" step which will save another 30% in this single-key groupby case, unfortunately with the way I have the code factored it's not completely trivial (this is all very bleeding-edge pandas, forthcoming in 0.7.0 final) - Wes 
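For reference, the same per-label sums can also be written with np.bincount, which avoids the argsort altogether. A minimal sketch (it assumes the labels are non-negative integers; the helper name is only illustrative and is not part of pandas or of the code above):

import numpy as np

def bincount_groupby(data, labels):
    # one bincount per column: entry k of the result is the sum of
    # data[labels == k, j], accumulated without sorting the rows
    nlabels = labels.max() + 1
    out = np.empty((nlabels, data.shape[1]))
    for j in range(data.shape[1]):
        out[:, j] = np.bincount(labels, weights=data[:, j], minlength=nlabels)
    return out

Whether this beats the reduceat or the hash-based grouping depends on the number of columns and distinct labels, so it is offered only as a point of comparison.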
From k-assem84 at hotmail.com Tue Feb 7 09:39:18 2012 From: k-assem84 at hotmail.com (suzana8447) Date: Tue, 7 Feb 2012 06:39:18 -0800 (PST) Subject: [SciPy-User] [SciPy-user] How to use Least square fit to fit three functions Message-ID: <33278985.post@talk.nabble.com> Hello, I would really appreciate it if someone could suggest how to use least square fit (scipy) to fit three functions at the same time, because I looked at the website of scipy's least square: http://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.leastsq.html and I realized that leastsq takes only one function as an argument. Is there any way to let leastsq fit three functions for example? Thanks in advance. -- View this message in context: http://old.nabble.com/How-to-use-Least-square-fit-to-fit-three-functions-tp33278985p33278985.html Sent from the Scipy-User mailing list archive at Nabble.com. From tyler104 at gmail.com Tue Feb 7 18:58:40 2012 From: tyler104 at gmail.com (Tyler) Date: Tue, 7 Feb 2012 18:58:40 -0500 Subject: [SciPy-User] Repeated measures scipy.stats support? Message-ID: I have the following question, and am wondering if it's possible to solve using existing scipy.stats, or anything else that anyone knows about =) If I have n subjects that I have an Active-Control measurement for, for instance "(Minutes it takes to complete 1st lap) - (Minutes it takes to complete second lap)", and each subject runs 1-4 times, how do I properly calculate a p value for "Is one lap faster than the other?" Example data:

Subject#   1stLap-2ndLap Minutes
1           .3
1          -.1
2           .2
2           .4
2          -.3
3           .6
4           .2
4          -.2
4           .1
4           .6
5           .5
5          -.4
6           .2
6           .1

-Thanks a million -Tyler From josef.pktd at gmail.com Wed Feb 8 09:21:38 2012 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Wed, 8 Feb 2012 09:21:38 -0500 Subject: [SciPy-User] [SciPy-user] How to use Least square fit to fit three functions In-Reply-To: <33278985.post@talk.nabble.com> References: <33278985.post@talk.nabble.com> Message-ID: On Tue, Feb 7, 2012 at 9:39 AM, suzana8447 wrote: > > Hello, > > I would really appreciate it if someone could suggest how to use least square fit > (scipy) to fit three functions at the same time, because I looked at the > website of scipy's least square: > > http://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.leastsq.html > > and I realized that leastsq takes only one function as an argument. Is there > any way to let leastsq fit three functions for example? Stack them into one array: have the residual function return the residuals of all three functions concatenated into a single 1-D array, and leastsq will minimize their combined sum of squares. If you need different weights for the 3 functions, then the easiest would be to use optimize.curve_fit. Josef > > Thanks in advance. > -- > View this message in context: http://old.nabble.com/How-to-use-Least-square-fit-to-fit-three-functions-tp33278985p33278985.html > Sent from the Scipy-User mailing list archive at Nabble.com. > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user From lists at hilboll.de Thu Feb 9 07:52:35 2012 From: lists at hilboll.de (Andreas H.)
Date: Thu, 9 Feb 2012 13:52:35 +0100 Subject: [SciPy-User] ask.scipy.org registration not working Message-ID: <63b77ef8929b228e5c125438645264c4.squirrel@srv2.s4y.tournesol-consulting.eu> Hi, I'm running Chromium 16.0.912.77 (Developer Build 118311 Linux) Ubuntu 10.04. When registering for ask.scipy.org, I always get the error message "You entered an invalid captcha". However, I don't see any captcha where I could enter anything. The ReCaptcha widget is just not shown. Trying the same with Opera/9.80 (X11; Linux x86_64; U; en) Presto/2.9.168 Version/11.52, I do see the ReCaptcha widget. However, when I enter a correct captcha and click "Register", I'm running into some sort of timeout after about a minute or two. The error message is Proxy Error The proxy server received an invalid response from an upstream server. The proxy server could not handle the request POST /login. Reason: Error reading from remote server Apache/2.2.3 (CentOS) Server at ask.scipy.org Port 80 Cheers, Andreas. From hturesson at gmail.com Thu Feb 9 09:01:07 2012 From: hturesson at gmail.com (Hjalmar Turesson) Date: Thu, 9 Feb 2012 09:01:07 -0500 Subject: [SciPy-User] ask.scipy.org registration not working In-Reply-To: <63b77ef8929b228e5c125438645264c4.squirrel@srv2.s4y.tournesol-consulting.eu> References: <63b77ef8929b228e5c125438645264c4.squirrel@srv2.s4y.tournesol-consulting.eu> Message-ID: I have same problem (invalid captcha) using both Chromium and Firefox 10. After a half year of trying I still have not been able to log in. Best, Hjalmar On Thu, Feb 9, 2012 at 7:52 AM, Andreas H. wrote: > Hi, > > I'm running Chromium 16.0.912.77 (Developer Build 118311 Linux) Ubuntu > 10.04. When registering for ask.scipy.org, I always get the error message > "You entered an invalid captcha". However, I don't see any captcha where I > could enter anything. The ReCaptcha widget is just not shown. > > Trying the same with Opera/9.80 (X11; Linux x86_64; U; en) Presto/2.9.168 > Version/11.52, I do see the ReCaptcha widget. However, when I enter a > correct captcha and click "Register", I'm running into some sort of > timeout after about a minute or two. The error message is > > Proxy Error > The proxy server received an invalid response from an upstream server. > The proxy server could not handle the request POST /login. > Reason: Error reading from remote server > Apache/2.2.3 (CentOS) Server at ask.scipy.org Port 80 > > Cheers, > Andreas. > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > -------------- next part -------------- An HTML attachment was scrubbed... URL: From kwatford at gmail.com Thu Feb 9 09:35:30 2012 From: kwatford at gmail.com (Ken Watford) Date: Thu, 9 Feb 2012 09:35:30 -0500 Subject: [SciPy-User] ask.scipy.org registration not working In-Reply-To: References: <63b77ef8929b228e5c125438645264c4.squirrel@srv2.s4y.tournesol-consulting.eu> Message-ID: Look at the current list of questions. Nobody has asked a question there since 2010. I would recommend asking your questions on one of the StackExchange sites (probably StackOverflow). If there's demand for a separate "math and science with python" site, then one could be proposed at http://area51.stackexchange.com/ The lack of activity at ask.scipy could be due to problems with the site (can't sign up, hard to sign in, no search feature), but there might just not be enough interest. 
The Area51 proposal process could help determine that, assuming the proposal was advertised in appropriate places. On Thu, Feb 9, 2012 at 9:01 AM, Hjalmar Turesson wrote: > I have same problem (invalid captcha) using both Chromium and Firefox 10. > After a half year of trying I still have not been able to log in. > > Best, > Hjalmar > > > On Thu, Feb 9, 2012 at 7:52 AM, Andreas H. wrote: >> >> Hi, >> >> I'm running Chromium 16.0.912.77 (Developer Build 118311 Linux) Ubuntu >> 10.04. When registering for ask.scipy.org, I always get the error message >> "You entered an invalid captcha". However, I don't see any captcha where I >> could enter anything. The ReCaptcha widget is just not shown. >> >> Trying the same with Opera/9.80 (X11; Linux x86_64; U; en) Presto/2.9.168 >> Version/11.52, I do see the ReCaptcha widget. However, when I enter a >> correct captcha and click "Register", I'm running into some sort of >> timeout after about a minute or two. The error message is >> >> ? Proxy Error >> ? The proxy server received an invalid response from an upstream server. >> ? The proxy server could not handle the request POST /login. >> ? Reason: Error reading from remote server >> ? Apache/2.2.3 (CentOS) Server at ask.scipy.org Port 80 >> >> Cheers, >> Andreas. >> >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user > > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From robert.kern at gmail.com Thu Feb 9 10:09:45 2012 From: robert.kern at gmail.com (Robert Kern) Date: Thu, 9 Feb 2012 15:09:45 +0000 Subject: [SciPy-User] ask.scipy.org registration not working In-Reply-To: <63b77ef8929b228e5c125438645264c4.squirrel@srv2.s4y.tournesol-consulting.eu> References: <63b77ef8929b228e5c125438645264c4.squirrel@srv2.s4y.tournesol-consulting.eu> Message-ID: On Thu, Feb 9, 2012 at 12:52, Andreas H. wrote: > Hi, > > I'm running Chromium 16.0.912.77 (Developer Build 118311 Linux) Ubuntu > 10.04. When registering for ask.scipy.org, I always get the error message > "You entered an invalid captcha". However, I don't see any captcha where I > could enter anything. The ReCaptcha widget is just not shown. > > Trying the same with Opera/9.80 (X11; Linux x86_64; U; en) Presto/2.9.168 > Version/11.52, I do see the ReCaptcha widget. However, when I enter a > correct captcha and click "Register", I'm running into some sort of > timeout after about a minute or two. The error message is > > ? Proxy Error > ? The proxy server received an invalid response from an upstream server. > ? The proxy server could not handle the request POST /login. > ? Reason: Error reading from remote server > ? Apache/2.2.3 (CentOS) Server at ask.scipy.org Port 80 Honestly, I'm not sure how to log into the machine anymore to check the configuration. The sysadmin who set the box up has since departed. The site never garnered much interest, and I haven't been able to solve its login problems. I hereby declare ask.scipy.org dead. Please ask your questions on this mailing list or StackOverflow. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." ? 
-- Umberto Eco From jason-sage at creativetrax.com Thu Feb 9 10:42:42 2012 From: jason-sage at creativetrax.com (Jason Grout) Date: Thu, 09 Feb 2012 09:42:42 -0600 Subject: [SciPy-User] ask.scipy.org registration not working In-Reply-To: References: <63b77ef8929b228e5c125438645264c4.squirrel@srv2.s4y.tournesol-consulting.eu> Message-ID: <4F33E972.9010301@creativetrax.com> On 2/9/12 8:35 AM, Ken Watford wrote: > Look at the current list of questions. Nobody has asked a question > there since 2010. > > I would recommend asking your questions on one of the StackExchange > sites (probably StackOverflow). > > If there's demand for a separate "math and science with python" site, > then one could be proposed at http://area51.stackexchange.com/ > > The lack of activity at ask.scipy could be due to problems with the > site (can't sign up, hard to sign in, no search feature), but there > might just not be enough interest. The Area51 proposal process could > help determine that, assuming the proposal was advertised in > appropriate places. +1 to a numpy/scipy/matplotlib stackexchange-type site. We've had a lot of success with ask.sagemath.org (running on the askbot [1] software, which is nice since it is open-source and you can host it and modify it if you want). It looks like a python stackexchange site was just proposed: http://area51.stackexchange.com/proposals/38281/python-programmers-hub Thanks, Jason [1] http://askbot.com/ From robert.kern at gmail.com Thu Feb 9 11:08:55 2012 From: robert.kern at gmail.com (Robert Kern) Date: Thu, 9 Feb 2012 16:08:55 +0000 Subject: [SciPy-User] ask.scipy.org registration not working In-Reply-To: <4F33E972.9010301@creativetrax.com> References: <63b77ef8929b228e5c125438645264c4.squirrel@srv2.s4y.tournesol-consulting.eu> <4F33E972.9010301@creativetrax.com> Message-ID: On Thu, Feb 9, 2012 at 15:42, Jason Grout wrote: > It looks like a python stackexchange site was just proposed: > http://area51.stackexchange.com/proposals/38281/python-programmers-hub And immediately shot down. StackOverflow wants all programming questions on StackOverflow. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." ? -- Umberto Eco From jason-sage at creativetrax.com Thu Feb 9 11:38:37 2012 From: jason-sage at creativetrax.com (Jason Grout) Date: Thu, 09 Feb 2012 10:38:37 -0600 Subject: [SciPy-User] ask.scipy.org registration not working In-Reply-To: References: <63b77ef8929b228e5c125438645264c4.squirrel@srv2.s4y.tournesol-consulting.eu> <4F33E972.9010301@creativetrax.com> Message-ID: <4F33F68D.9000304@creativetrax.com> On 2/9/12 10:08 AM, Robert Kern wrote: > On Thu, Feb 9, 2012 at 15:42, Jason Grout wrote: > >> It looks like a python stackexchange site was just proposed: >> http://area51.stackexchange.com/proposals/38281/python-programmers-hub > > And immediately shot down. StackOverflow wants all programming > questions on StackOverflow. > However, there is a mathematica stackexchange proposal: http://area51.stackexchange.com/proposals/37304/mathematica So it seems that they might be open to sites that are dedicated to using specific software. Or is it just as good for people to tag posts with numpy, scipy, or matplotlib on the stackoverflow site? That seems reasonable too. Already there are 1313 matplotlib questions (7 today), 2062 numpy questions (5 asked today), 703 scipy questions (10 asked this week), and 275 ipython questions. 
It is easy to "follow" these questions too---just search for the tag in the tags area, click on the tag, and click "subscribe". At least, I think that's how it works. It looks like lots of people here might already be following those areas. Thanks, Jason From eirik.gjerlow at astro.uio.no Thu Feb 9 04:54:05 2012 From: eirik.gjerlow at astro.uio.no (=?ISO-8859-1?Q?Eirik_Gjerl=F8w?=) Date: Thu, 09 Feb 2012 10:54:05 +0100 Subject: [SciPy-User] Numpy array slicing Message-ID: <4F3397BD.1000206@uio.no> Hello, this is (I think) a rather basic question about numpy slicing. I have the following code: In [29]: a.shape Out[29]: (3, 4, 12288, 2) In [30]: mask.shape Out[30]: (3, 12288) In [31]: mask.dtype Out[31]: dtype('bool') In [32]: sum(mask[0]) Out[32]: 12285 In [33]: a[[0] + [slice(None)] + [mask[0]] + [slice(None)]].shape Out[33]: (12285, 4, 2) My question is: Why is not the final shape (4, 12285, 2) instead of (12285, 4, 2)? Eirik Gjerl?w From kwatford at gmail.com Thu Feb 9 11:55:43 2012 From: kwatford at gmail.com (Ken Watford) Date: Thu, 9 Feb 2012 11:55:43 -0500 Subject: [SciPy-User] ask.scipy.org registration not working In-Reply-To: References: <63b77ef8929b228e5c125438645264c4.squirrel@srv2.s4y.tournesol-consulting.eu> <4F33E972.9010301@creativetrax.com> Message-ID: On Thu, Feb 9, 2012 at 11:08 AM, Robert Kern wrote: > On Thu, Feb 9, 2012 at 15:42, Jason Grout wrote: > >> It looks like a python stackexchange site was just proposed: >> http://area51.stackexchange.com/proposals/38281/python-programmers-hub > > And immediately shot down. StackOverflow wants all programming > questions on StackOverflow. To be fair, that proposal was "The same as StackOverflow, but just for Python". I noticed there's a Computational Science stack: http://scicomp.stackexchange.com/ Its "python" tag seems a tad underrepresented, but it's still in the beta phase. It seems to have the same sort of focus as ask.scipy.org, just without the "but just for Python" requirement. Perhaps this one could be useful for scipy questions? From guziy.sasha at gmail.com Thu Feb 9 12:13:19 2012 From: guziy.sasha at gmail.com (Oleksandr Huziy) Date: Thu, 9 Feb 2012 12:13:19 -0500 Subject: [SciPy-User] Numpy array slicing In-Reply-To: <4F3397BD.1000206@uio.no> References: <4F3397BD.1000206@uio.no> Message-ID: Hi, what will happen if you do: a[[0] + [slice(None)] + [[mask[0]]] + [slice(None)]].shape please, give the code to generate the mask and the array, so we could test our hypotheses. thanks -- Oleksandr Huziy 2012/2/9 Eirik Gjerl?w : > Hello, this is (I think) a rather basic question about numpy slicing. I > have the following code: > > In [29]: a.shape > Out[29]: (3, 4, 12288, 2) > > In [30]: mask.shape > Out[30]: (3, 12288) > > In [31]: mask.dtype > Out[31]: dtype('bool') > > In [32]: sum(mask[0]) > Out[32]: 12285 > > In [33]: a[[0] + [slice(None)] + [mask[0]] + [slice(None)]].shape > Out[33]: (12285, 4, 2) > > My question is: Why is not the final shape (4, 12285, 2) instead of > (12285, 4, 2)? > > Eirik Gjerl?w > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user From benjamin.hause at colorado.edu Thu Feb 9 12:37:32 2012 From: benjamin.hause at colorado.edu (Benjamin) Date: Thu, 9 Feb 2012 17:37:32 +0000 (UTC) Subject: [SciPy-User] [F2py] basic help Message-ID: I have a fortran 77 code (which I did not write, but currently runs fine) and I would like to create a 'front end' for it with python. 
As a first step, I would like to simply be able to call this fortran code from python (ideally it would just be one call and I pass in the input as paramaters). Based on the user guide (http://cens.ioc.ee/projects/f2py2e/usersguide/) section 2.1 'the quick way' I have entered the command: f2py -c *.f -m slab but I get errors. The output is really long so I won't post it here, but it looks like it does fine until it tries to compile the code. Some errors/warnings are at the bottom of this post. Note that the only changes I made to the main fortran program was making it a subroutine and passing the input as parameters. I was hoping this would be fairly trivial for me, like it was for this guy: http://moo.nac.uci.edu/~hjm/fd_rrt1d/index.html Any advice is appreciated. Ben 446 if (biz.gt.3.75) go to 148 1 Warning: Label 446 at (1) defined but not used gfortran:f77: slab.f Warning: Nonconforming tab character in column 1 of line 221 slab.h:13.24: Included at slab.f:14: parameter(mm=vm*im*jm) 1 Warning: Possible change of value in conversion from REAL(4) to INTEGER(4) at (1) slab.h:44.38: Included at slab.f:14: common/cc/xg,yg,vg,dx,dy,lx,ly, 1 Error: COMMON attribute conflicts with DUMMY attribute in 'lx' at (1) From pav at iki.fi Thu Feb 9 13:24:54 2012 From: pav at iki.fi (Pauli Virtanen) Date: Thu, 09 Feb 2012 19:24:54 +0100 Subject: [SciPy-User] [F2py] basic help In-Reply-To: References: Message-ID: 09.02.2012 18:37, Benjamin kirjoitti: > I have a fortran 77 code (which I did not write, but currently runs fine) and I > would like to create a 'front end' for it with python. As a first step, I would > like to simply be able to call this fortran code from python (ideally it would > just be one call and I pass in the input as paramaters). > > Based on the user guide (http://cens.ioc.ee/projects/f2py2e/usersguide/) section > 2.1 'the quick way' I have entered the command: > f2py -c *.f -m slab The errors you get seem like the fortran files themselves don't compile. If so, it's not a f2py-related issue. Check that you can compile the Fortran source files separately: gfortran -o slab.o slab.f From wesmckinn at gmail.com Thu Feb 9 18:13:29 2012 From: wesmckinn at gmail.com (Wes McKinney) Date: Thu, 9 Feb 2012 18:13:29 -0500 Subject: [SciPy-User] ANN: pandas 0.7.0 release Message-ID: Dear all, I'm extremely pleased (and relieved!) to announce the release of pandas 0.7.0! This is the largest single release in the last year, spanning 210 GitHub issues and pull requests with 563 commits from 17 unique authors. It brings with it a wealth of new functionality, performance improvements, bug fixes, and a handful of very minor API changes. I recommend that all users upgrade to the new release as soon as you can. Here are a few highlights of the release: * Completely revamped, high-performance merge/join infrastructure. Full support for all SQL-style joins. Fastest open source implementation I am aware of. 
* New unified concat function for easily concatenating pandas objects * Better pivot table and cross-tabulation functionality * Numerous new Series and DataFrame instance methods * Substantially improved performance of GroupBy operations * Excel 2007 read/write support * Much better unicode handling on both Python 2 and 3 * Improved console DataFrame formatting * More than 70 bug fixes * Numerous other performance and infrastructural improvements This release also coincides with the creation of a new tool, vbench (https://github.com/wesm/vbench), for systematically monitoring the performance of Python code (in this case pandas) over time. There are now 57 vbenchmarks being tracked with more added all the time (http://pandas.pydata.org/pandas-docs/vbench/). This will help ensure that pandas remains a high performance library in addition to being robust and stable for production application development. pandas has a new project front page at http://pandas.pydata.org. The main repository has also been moved to the newly created PyData organization on GitHub (http://github.com/pydata). Windows binaries are available on PyPI, and .deb binaries will be available in Debian sid and NeuroDebian soon thanks to Yaroslav Halchenko! See the "What's New" page and full release notes for now. Thanks to everyone who contributed to the release! Tons more planned for pandas in 2012 on the road toward a 1.0 release. Looking forward to working with the community to make the library even better! Best, Wes What is it ========== pandas is a Python package providing fast, flexible, and expressive data structures designed to make working with ?relational? or ?labeled? data both easy and intuitive. It aims to be the fundamental high-level building block for doing practical, real world data analysis in Python. Links ===== Release Notes: http://github.com/pydata/pandas/blob/master/RELEASE.rst Documentation: http://pandas.pydata.org Installers: http://pypi.python.org/pypi/pandas Code Repository: http://github.com/pydata/pandas Mailing List: http://groups.google.com/group/pystatsmodels Blog: http://blog.wesmckinney.com From alec.kalinin at gmail.com Fri Feb 10 04:27:32 2012 From: alec.kalinin at gmail.com (Alexander Kalinin) Date: Fri, 10 Feb 2012 12:27:32 +0300 Subject: [SciPy-User] Element-wise multiplication performance: why operations on sliced matrices are faster? Message-ID: I found an interesting fact about element-wise matrix multiplication performance. I expected that full vectorized NumPy operations should always be faster then the loops. But it is not always true. Look at the code: import numpy as np import time def calc(P): for i in range(30): P2 = P * P # full matrix N = 4000 P = np.random.rand(N, N) t0 = time.time() calc(P) t1 = time.time() print " full matrix {:.5f} seconds".format(t1 - t0) # sliced matrix N = 2000 P = np.random.rand(N, N) t0 = time.time() for i in range(4): calc(P) t1 = time.time() print " sliced matrix {:.5f} seconds".format(t1 - t0) The results are: full matrix 2.60245 seconds sliced matrix 1.49381 seconds I continue study of this case and found that the performance depends on matrix size. Look at the attached plot. The x-axis is the dimension of matrices, the y-axis is the execution time. Red line are the full matrix executions times, blue line are the sliced matrix execution times. The plot shows that the 2000 is the critical dimension that cause performance degradation step. Could you, please, explain me this fact? 
My configuration: OS: Ubuntu 11.10 (oneiric) CPU: Intel(R) Core(TM) i5 CPU M 480 @ 2.67GHz CPUs: 4 Memory: 3746 MiB L2 cache: 3072 KB -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: fig1.png Type: image/png Size: 35984 bytes Desc: not available URL: From matthieu.brucher at gmail.com Fri Feb 10 04:45:04 2012 From: matthieu.brucher at gmail.com (Matthieu Brucher) Date: Fri, 10 Feb 2012 10:45:04 +0100 Subject: [SciPy-User] Element-wise multiplication performance: why operations on sliced matrices are faster? In-Reply-To: References: Message-ID: Hi, It is not vectrorizd per se. It is a C loop, which is faster than Python ones. The full matrix has a jump at half the size of the sliced matrix. My guess is that it is a simple cache effect: with the full matrix, everything fits in cache at 2000, but not at 4000, but for a "sliced" matrix (it is not a real sliced matrix), 4000 means that the "slices" are the same size as the full matrix at 2000. What you see is just the blue line being twice as slow as the red one + the Python for loop overhead. Matthieu 2012/2/10 Alexander Kalinin > I found an interesting fact about element-wise matrix multiplication > performance. I expected that full vectorized NumPy operations should always > be faster then the loops. But it is not always true. Look at the code: > > import numpy as np > > import time > > > def calc(P): > > for i in range(30): > > P2 = P * P > > # full matrix > > N = 4000 > > P = np.random.rand(N, N) > > t0 = time.time() > > calc(P) > > t1 = time.time() > > print " full matrix {:.5f} seconds".format(t1 - t0) > > > # sliced matrix > > N = 2000 > > P = np.random.rand(N, N) > > t0 = time.time() > > for i in range(4): > > calc(P) > > t1 = time.time() > > print " sliced matrix {:.5f} seconds".format(t1 - t0) > > The results are: > full matrix 2.60245 seconds > sliced matrix 1.49381 seconds > > > I continue study of this case and found that the performance depends on > matrix size. Look at the attached plot. The x-axis is the dimension of > matrices, the y-axis is the execution time. Red line are the full matrix > executions times, blue line are the sliced matrix execution times. The plot > shows that the 2000 is the critical dimension that cause performance > degradation step. Could you, please, explain me this fact? > > My configuration: > OS: Ubuntu 11.10 (oneiric) > CPU: Intel(R) Core(TM) i5 CPU M 480 @ 2.67GHz > CPUs: 4 > Memory: 3746 MiB > L2 cache: 3072 KB > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > -- Information System Engineer, Ph.D. Blog: http://matt.eifelle.com LinkedIn: http://www.linkedin.com/in/matthieubrucher -------------- next part -------------- An HTML attachment was scrubbed... URL: From tmp50 at ukr.net Fri Feb 10 10:10:46 2012 From: tmp50 at ukr.net (Dmitrey) Date: Fri, 10 Feb 2012 17:10:46 +0200 Subject: [SciPy-User] [ANN] new solver for multiobjective optimization problems Message-ID: <87834.1328886646.5803578716629106688@ffe16.ukr.net> hi, I'm glad to inform you about new Python solver for multiobjective optimization (MOP). Some changes committed to solver interalg made it capable of handling global nonlinear constrained multiobjective problem (MOP), see the page for more details. 
> > Using interalg you can be 100% sure your result covers whole Pareto front according to the required tolerances on objective functions. > > Available features include real-time or final graphical output, possibility of involving parallel calculations, handling both continuous and discrete variables, export result to xls files. > > Regards, D. > -------------- next part -------------- An HTML attachment was scrubbed... URL: From eirik.gjerlow at astro.uio.no Thu Feb 9 12:19:52 2012 From: eirik.gjerlow at astro.uio.no (=?UTF-8?B?RWlyaWsgR2plcmzDuHc=?=) Date: Thu, 09 Feb 2012 18:19:52 +0100 Subject: [SciPy-User] Numpy array slicing In-Reply-To: References: <4F3397BD.1000206@uio.no> Message-ID: <4F340038.5010406@uio.no> Hey, I asked this at numpy-discussion as well, and it seems the issue is that I am mixing advanced indexing and basic slicing, and the way numpy handles this: http://mail.scipy.org/pipermail/numpy-discussion/2012-February/060232.html Anyway, there should be enough information to reproduce the problem. a can be any numpy array with the given shape, so a = np.zeros((3, 4, 12288, 2)) mask = np.zeros((3, 12288), dtype='bool') mask[:] = True mask[0, [3, 7, 9, 15, 20]] = False for instance. Thanks for replying! Eirik On 09. feb. 2012 18:13, Oleksandr Huziy wrote: > Hi, what will happen if you do: > > a[[0] + [slice(None)] + [[mask[0]]] + [slice(None)]].shape > > please, give the code to generate the mask and the array, so we could > test our hypotheses. > thanks > > -- > Oleksandr Huziy > > 2012/2/9 Eirik Gjerl?w: >> Hello, this is (I think) a rather basic question about numpy slicing. I >> have the following code: >> >> In [29]: a.shape >> Out[29]: (3, 4, 12288, 2) >> >> In [30]: mask.shape >> Out[30]: (3, 12288) >> >> In [31]: mask.dtype >> Out[31]: dtype('bool') >> >> In [32]: sum(mask[0]) >> Out[32]: 12285 >> >> In [33]: a[[0] + [slice(None)] + [mask[0]] + [slice(None)]].shape >> Out[33]: (12285, 4, 2) >> >> My question is: Why is not the final shape (4, 12285, 2) instead of >> (12285, 4, 2)? >> >> Eirik Gjerl?w >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user From andrew_giessel at hms.harvard.edu Fri Feb 10 22:11:46 2012 From: andrew_giessel at hms.harvard.edu (Andrew Giessel) Date: Fri, 10 Feb 2012 22:11:46 -0500 Subject: [SciPy-User] down-sampling an array by averaging - vectorized form? Message-ID: Hello all, I'm looking to down-sample an image by averaging. I quickly hacked up the following code, which does exactly what I want, but the double loop is slow (the images I'm working with are ~2000x2000 pixels). Is there a nice way to vectorize this? A quick profile showed that most of the time is spend averaging- perhaps there is a way to utilize np.sum or np.cumsum, divide the whole array, and then take every so many pixels? This method of down-sampling (spatial averaging) makes sense for the type of data I'm using and yields good results, but I'm also open to alternatives. Thanks in advance! 
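An aside on the Numpy array slicing question above: the boolean mask is an advanced index, and combining it with the other indices in a single indexing operation triggers NumPy's rule that moves the advanced-index dimensions to the front of the result (the numpy-discussion post linked in that thread walks through the details). A minimal sketch, using the shapes from that thread, of indexing in two steps so the masked axis stays where it was:

import numpy as np

a = np.zeros((3, 4, 12288, 2))
mask = np.ones((3, 12288), dtype=bool)
mask[0, [3, 7, 9, 15, 20]] = False     # 12283 True entries remain in mask[0]

# Apply the integer index first, then the boolean mask on its own:
b = a[0][:, mask[0], :]
print(b.shape)                         # (4, 12283, 2), masked axis stays in place

# Equivalent, using compress along the masked axis:
c = np.compress(mask[0], a[0], axis=1)
print(c.shape)                         # (4, 12283, 2)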
Andrew ###################### import numpy as np def downsample(array, reduction): """example call for 2fold size reduction: newImage = downsample(image, 2)""" newArray = np.empty(array.shape[0]/reduction, array.shape[1]/reduction) for x in range(newArray.shape[0]): for y in range(newArray.shape[1]): newArray[x,y] = np.mean(array[x*reduction:((x+1)*reduction)-1, y*reduction:((y+1)*reduction)-1]) return newArray ###################### -- Andrew Giessel, PhD Department of Neurobiology, Harvard Medical School 220 Longwood Ave Boston, MA 02115 ph: 617.432.7971 email: andrew_giessel at hms.harvard.edu -------------- next part -------------- An HTML attachment was scrubbed... URL: From andrew.giessel at gmail.com Fri Feb 10 22:19:01 2012 From: andrew.giessel at gmail.com (andrew giessel) Date: Fri, 10 Feb 2012 22:19:01 -0500 Subject: [SciPy-User] down-sampling an array by averaging - vectorized form? In-Reply-To: References: Message-ID: Hello all, I'm looking to down-sample an image by averaging. I quickly hacked up the following code, which does exactly what I want, but the double loop is slow (the images I'm working with are ~2000x2000 pixels). Is there a nice way to vectorize this? A quick profile showed that most of the time is spend averaging- perhaps there is a way to utilize np.sum or np.cumsum, divide the whole array, and then take every so many pixels? This method of down-sampling (spatial averaging) makes sense for the type of data I'm using and yields good results, but I'm also open to alternatives. Thanks in advance! Andrew ###################### import numpy as np def downsample(array, reduction): """example call for 2fold size reduction: newImage = downsample(image, 2)""" newArray = np.empty(array.shape[0]/reduction, array.shape[1]/reduction) for x in range(newArray.shape[0]): for y in range(newArray.shape[1]): newArray[x,y] = np.mean(array[x*reduction:((x+1)*reduction)-1, y*reduction:((y+1)*reduction)-1]) return newArray ###################### -- Andrew Giessel, PhD Department of Neurobiology, Harvard Medical School 220 Longwood Ave Boston, MA 02115 ph: 617.432.7971 email: andrew_giessel at hms.harvard.edu -------------- next part -------------- An HTML attachment was scrubbed... URL: From jkhilmer at chemistry.montana.edu Sat Feb 11 01:08:08 2012 From: jkhilmer at chemistry.montana.edu (jkhilmer at chemistry.montana.edu) Date: Fri, 10 Feb 2012 23:08:08 -0700 Subject: [SciPy-User] down-sampling an array by averaging - vectorized form? In-Reply-To: References: Message-ID: Andrew, This is a very naive response, since I don't have nearly the experience as many of the other contributors here. Since you're working with images, and you might want more complicated variants of this process in the future, why not convolve your array and slice the output with a stride/step. For the sizes you're talking about, memory shouldn't be a concern. That should give you a very flexible procedure that is inherently "vectorized". Jonathan On Fri, Feb 10, 2012 at 8:19 PM, andrew giessel wrote: > Hello all, > > I'm looking to down-sample an image by averaging. ?I quickly hacked up the > following code, which does exactly what I want, but the double loop is slow > (the images I'm working with are ~2000x2000 pixels). ?Is there a nice way to > vectorize this? ?A quick profile showed that most of the time is spend > averaging- perhaps there is a way to utilize np.sum or np.cumsum, divide the > whole array, and then take every so many pixels? 
> > This method of down-sampling (spatial averaging) makes sense for the type of > data I'm using and yields good results, but I'm also open to alternatives. > ?Thanks in advance! > > Andrew > > ###################### > import numpy as np > > def downsample(array, reduction): > ? ? """example call for 2fold size reduction: ?newImage = downsample(image, > 2)""" > > ? ? newArray = np.empty(array.shape[0]/reduction,?array.shape[1]/reduction) > > ? ? for x in range(newArray.shape[0]): > ? ? ? ? for y in range(newArray.shape[1]): > ? ? ? ? ? ? newArray[x,y] = > np.mean(array[x*reduction:((x+1)*reduction)-1,?y*reduction:((y+1)*reduction)-1]) > > ? ? return newArray > ###################### > > -- > Andrew Giessel, PhD > > Department of Neurobiology, Harvard Medical School > 220 Longwood Ave Boston, MA 02115 > ph: 617.432.7971 email: andrew_giessel at hms.harvard.edu > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From Jerome.Kieffer at esrf.fr Sat Feb 11 03:53:46 2012 From: Jerome.Kieffer at esrf.fr (Jerome Kieffer) Date: Sat, 11 Feb 2012 09:53:46 +0100 Subject: [SciPy-User] down-sampling an array by averaging - vectorized form? In-Reply-To: References: Message-ID: <20120211095346.1c178e5d.Jerome.Kieffer@esrf.fr> Hi, Your needs are very close to "binning" (just devide the result by the number of input pixel in each output): def binning(inputArray, binsize): """ @param inputArray: input ndarray @param binsize: int or 2-tuple representing the size of the binning @return: binned input ndarray """ inputSize = inputArray.shape outputSize = [] assert(len(inputSize) == 2) if isinstance(binsize, int): binsize = (binsize, binsize) for i, j in zip(inputSize, binsize): assert(i % j == 0) outputSize.append(i // j) if numpy.array(binsize).prod() < 50: out = numpy.zeros(tuple(outputSize)) for i in xrange(binsize[0]): for j in xrange(binsize[1]): out += inputArray[i::binsize[0], j::binsize[1]] else: temp = inputArray.copy() temp.shape = (outputSize[0], binsize[0], outputSize[1], binsize[1]) out = temp.sum(axis=3).sum(axis=1) return out This function implements 2 methods: - one faster for small binning based on a loop and a sum of all elements - one for larger binning (8x8 and +) based on a reshape and two sum on two different axis HTH -- J?r?me Kieffer Data analysis unit - ESRF From ralf.gommers at googlemail.com Sat Feb 11 09:11:06 2012 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Sat, 11 Feb 2012 15:11:06 +0100 Subject: [SciPy-User] ANN: SciPy 0.10.1 release candidate 1 Message-ID: Hi all, I am pleased to announce the availability of the first release candidate of SciPy 0.10.1. Please try out this release and report any problems on the scipy-dev mailing list. If no problems are found, the final release will be available in one week. Sources and binaries can be found at http://sourceforge.net/projects/scipy/files/scipy/0.10.1rc1/, release notes are copied below. Cheers, Ralf ========================== SciPy 0.10.1 Release Notes ========================== .. contents:: SciPy 0.10.1 is a bug-fix release with no new features compared to 0.10.0. Main changes ------------ The most important changes are:: 1. The single precision routines of ``eigs`` and ``eigsh`` in ``scipy.sparse.linalg`` have been disabled (they internally use double precision now). 2. 
A compatibility issue related to changes in NumPy macros has been fixed, in order to make scipy 0.10.1 compile with the upcoming numpy 1.7.0 release. Other issues fixed ------------------ - #835: stats: nan propagation in stats.distributions - #1202: io: netcdf segfault - #1531: optimize: make curve_fit work with method as callable. - #1560: linalg: fixed mistake in eig_banded documentation. - #1565: ndimage: bug in ndimage.variance - #1457: ndimage: standard_deviation does not work with sequence of indexes - #1562: cluster: segfault in linkage function - #1568: stats: One-sided fisher_exact() returns `p` < 1 for 0 successful attempts - #1575: stats: zscore and zmap handle the axis keyword incorrectly -------------- next part -------------- An HTML attachment was scrubbed... URL: From tsyu80 at gmail.com Sat Feb 11 10:40:40 2012 From: tsyu80 at gmail.com (Tony Yu) Date: Sat, 11 Feb 2012 10:40:40 -0500 Subject: [SciPy-User] down-sampling an array by averaging - vectorized form? In-Reply-To: References: Message-ID: On Fri, Feb 10, 2012 at 10:11 PM, Andrew Giessel < andrew_giessel at hms.harvard.edu> wrote: > Hello all, > > I'm looking to down-sample an image by averaging. I quickly hacked up the > following code, which does exactly what I want, but the double loop is slow > (the images I'm working with are ~2000x2000 pixels). Is there a nice way > to vectorize this? A quick profile showed that most of the time is spend > averaging- perhaps there is a way to utilize np.sum or np.cumsum, divide > the whole array, and then take every so many pixels? > > This method of down-sampling (spatial averaging) makes sense for the type > of data I'm using and yields good results, but I'm also open to > alternatives. Thanks in advance! > > Andrew > > ###################### > import numpy as np > > def downsample(array, reduction): > """example call for 2fold size reduction: newImage = > downsample(image, 2)""" > > newArray = np.empty(array.shape[0]/reduction, array.shape[1]/reduction) > > for x in range(newArray.shape[0]): > for y in range(newArray.shape[1]): > newArray[x,y] = > np.mean(array[x*reduction:((x+1)*reduction)-1, y*reduction:((y+1)*reduction)-1]) > > return newArray > ###################### > > I think `scipy.ndimage.zoom` does what you want. Or actually, it does the opposite: your 2fold size reduction example would be >>> from scipy import ndimage >>> small_image = ndimage.zoom(image, 0.5) -Tony -------------- next part -------------- An HTML attachment was scrubbed... URL: From emmanuelle.gouillart at normalesup.org Sat Feb 11 10:30:02 2012 From: emmanuelle.gouillart at normalesup.org (Emmanuelle Gouillart) Date: Sat, 11 Feb 2012 16:30:02 +0100 Subject: [SciPy-User] Euroscipy 2012 - Brussels - August 23-37 - call for abstracts Message-ID: <20120211153002.GA3217@phare.normalesup.org> ------------------------------------------------------------- Euroscipy 2012, the 5th European meeting on Python in Science ------------------------------------------------------------- It is our pleasure to announce the conference Euroscipy 2012, that will be held in **Brussels**, **August 23-27**, at the Universit? Libre de Bruxelles (ULB, Solbosch Campus). The EuroSciPy meeting is a cross-disciplinary gathering focused on the use and development of the Python language in scientific research and industry. This event strives to bring together both users and developers of scientific tools, as well as academic research and state of the art industry. 
Website ======= http://www.euroscipy.org/conference/euroscipy2012 Main topics =========== - Presentations of scientific tools and libraries using the Python language, including but not limited to: - vector and array manipulation - parallel computing - scientific visualization - scientific data flow and persistence - algorithms implemented or exposed in Python - web applications and portals for science and engineering. - Reports on the use of Python in scientific achievements or ongoing projects. - General-purpose Python tools that can be of special interest to the scientific community. Tutorials ========= There will be two tutorial tracks at the conference, an introductory one, to bring up to speed with the Python language as a scientific tool, and an advanced track, during which experts of the field will lecture on specific advanced topics such as advanced use of numpy, paralllel computing, advanced testing... Keynote Speaker: David Beazley ============================== This year, we are very happy to welcome David Beazley (http://www.dabeaz.com) as our keynote speaker. David is the original author of SWIG, a software development tool that connects programs written in C and C++ with a variety of high-level programming languages such as Python. He has also authored the acclaimed Python Essential Reference. Important dates =============== Talk submission deadline: Mon Apr 30, 2012 Program announced: end of May Tutorials tracks: Thursday August 23 - Friday August 24, 2012 Conference track: Saturday August 25 - Sunday August 26, 2012 Satellites: Monday August 27 Satellite meetings are yet to be announced. Call for talks and posters ========================== We are soliciting talks and posters that discuss topics related to scientific computing using Python. These include applications, teaching, future development directions, and research. We welcome contributions from the industry as well as the academic world. Indeed, industrial research and development as well academic research face the challenge of mastering IT tools for exploration, modeling and analysis. We look forward to hearing your recent breakthroughs using Python! Submission guidelines ===================== - We solicit proposals in the form of a **one-page long abstract**. - Submissions whose main purpose is to promote a commercial product or service will be refused. - All accepted proposals must be presented at the EuroSciPy conference by at least one author. Abstracts should be detailed enough for the reviewers to appreciate the interest of the work for a wide audience. Examples of abstracts can be found on last year's webpage www.euroscipy.org/track/3992 (talks tab). The one-page long abstracts are for conference planning and selection purposes only. How to submit an abstract ========================= To submit a talk to the EuroScipy conference follow the instructions here: http://www.euroscipy.org/card/euroscipy2012_call_for_contributions Organizers ========== Chairs: - Pierre de Buyl - Didrik Pinte Local organizing committee - Kael Hanson - Nicolas Pettiaux Program committee - Tiziano Zito (Chair) - Ga?l Varoquaux - Stefan Van Der Walt - Konrad Hinsen - Emmanuelle Gouillart - Mike M?ller - Hans Petter Langtangen - Pierre de Buyl - Kael Hanson Tutorials chair: Valentin Haenel General organizing committee - Communication: Emmanuelle Gouillart - Sponsoring: Mike M?ller. - Web site: Nicolas Chauvat. Still have questions? 
===================== send an e-mail to org-team at lists.euroscipy.org 94,1 -- Emmanuelle, for the organizing team From deshpande.jaidev at gmail.com Sat Feb 11 11:15:28 2012 From: deshpande.jaidev at gmail.com (Jaidev Deshpande) Date: Sat, 11 Feb 2012 21:45:28 +0530 Subject: [SciPy-User] Euroscipy 2012 - Brussels - August 23-37 - call for abstracts In-Reply-To: <20120211153002.GA3217@phare.normalesup.org> References: <20120211153002.GA3217@phare.normalesup.org> Message-ID: Hi, Glad to see this. Any chance of travel grants? Thanks From andrew_giessel at hms.harvard.edu Sat Feb 11 14:23:32 2012 From: andrew_giessel at hms.harvard.edu (Andrew Giessel) Date: Sat, 11 Feb 2012 14:23:32 -0500 Subject: [SciPy-User] down-sampling an array by averaging - vectorized form? In-Reply-To: References: Message-ID: I'd like to thank everyone for their responses- they were really helpful in thinking about the problem. All the solutions people posted were faster than my original brutish algorithm, but all had subtle differences as well. None of them are 'vectorized' persay but all are more clever or effeicent ways of getting at the same problem. I thought I'd write a couple of quick comments. Convolution with a kernel is a good idea (one that I should have thought of since I've been working with various types of filtering recently) and works quite quickly. Depending on the type and size of the kernel it yields slightly different local averages at each point which can be sub-sampled to yield a smaller array. I didn't play with too many types of kernels but I'd say it's important to consider the nature of the kernel and the relation of the subsampling frequency to the size of the kernel in order to get the best results. The binning followed by scalar division is wicked fast and yields results that are very close to my original algorithm. The reshaping seems very clever and I am going to read it more carefully to learn some lessons there, I think. The ndimage.zoom approach is a very general approach (and roughly as quick as the others). As far as I can tell, that function uses spline interpolation for zoom factors > 1, and I'm unsure how it deals with zoom factors < 1. It might do nearest neighbor or something like that, I wasn't able to quickly determine from glancing at the source. If anyone knows, it would be cool to hear. I think I'll probably go with binning for now- We'll be dealing with hundreds of wide-field microscopy images (of this rough size) in every experiment and speed is a factor. Thanks again everyone! Best, Andrew On Sat, Feb 11, 2012 at 10:40, Tony Yu wrote: > > > On Fri, Feb 10, 2012 at 10:11 PM, Andrew Giessel < > andrew_giessel at hms.harvard.edu> wrote: > >> Hello all, >> >> I'm looking to down-sample an image by averaging. I quickly hacked up >> the following code, which does exactly what I want, but the double loop is >> slow (the images I'm working with are ~2000x2000 pixels). Is there a nice >> way to vectorize this? A quick profile showed that most of the time is >> spend averaging- perhaps there is a way to utilize np.sum or np.cumsum, >> divide the whole array, and then take every so many pixels? >> >> This method of down-sampling (spatial averaging) makes sense for the type >> of data I'm using and yields good results, but I'm also open to >> alternatives. Thanks in advance! 
>> >> Andrew >> ###################### >> import numpy as np >> >> def downsample(array, reduction): >> """example call for 2fold size reduction: newImage = >> downsample(image, 2)""" >> >> newArray = >> np.empty(array.shape[0]/reduction, array.shape[1]/reduction) >> >> for x in range(newArray.shape[0]): >> for y in range(newArray.shape[1]): >> newArray[x,y] = >> np.mean(array[x*reduction:((x+1)*reduction)-1, y*reduction:((y+1)*reduction)-1]) >> >> return newArray >> ###################### >> >> > I think `scipy.ndimage.zoom` does what you want. Or actually, it does the > opposite: your 2fold size reduction example would be > > >>> from scipy import ndimage > >>> small_image = ndimage.zoom(image, 0.5) > > -Tony > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > -- Andrew Giessel, PhD Department of Neurobiology, Harvard Medical School 220 Longwood Ave Boston, MA 02115 ph: 617.432.7971 email: andrew_giessel at hms.harvard.edu -------------- next part -------------- An HTML attachment was scrubbed... URL: From zachary.pincus at yale.edu Sat Feb 11 14:56:54 2012 From: zachary.pincus at yale.edu (Zachary Pincus) Date: Sat, 11 Feb 2012 14:56:54 -0500 Subject: [SciPy-User] down-sampling an array by averaging - vectorized form? In-Reply-To: References: Message-ID: Hi Andrew, > None of them are 'vectorized' persay but all are more clever or effeicent ways of getting at the same problem. I thought I'd write a couple of quick comments. It depends what you mean by "vectorized" -- none are using SIMD instructions on the chip, but from the Matlab/numpy perspective I think people often mean "vectorized" as "multiple data elements acted on by a single command" such as C = A + B, where A and B are matrices. In any case, the reshaping approach is "vectorized" according to the latter definition, which obviously really just means "the loops are pushed down into C"... > The binning followed by scalar division is wicked fast and yields results that are very close to my original algorithm. The reshaping seems very clever and I am going to read it more carefully to learn some lessons there, I think. For more information on reshaping to do decimation, see the "avoiding loops when downsampling arrays" thread on the numpy-discussion list from last week: https://groups.google.com/forum/#!topic/numpy/qyDKJTj5jx4 There's a bit more discussion about how this works, and some memory-layout caveats to be aware of. Also, instead of doing the reshaping, you could see if hard-coding the averaging is faster. Here's how to do it for the 2x2 case: B = (A[::2,::2] + A[1::2,::2] + A[::2,1::2] + A[1::2,1::2])/4.0 > The ndimage.zoom approach is a very general approach (and roughly as quick as the others). As far as I can tell, that function uses spline interpolation for zoom factors > 1, and I'm unsure how it deals with zoom factors < 1. It might do nearest neighbor or something like that, I wasn't able to quickly determine from glancing at the source. If anyone knows, it would be cool to hear. I'm pretty certain that the zoom function doesn't do anything smart for image decimation/minification -- it just uses the requested interpolation order to take a point-sample of the image at the calculated coordinates. Lack of good decimation is a limitation of ndimage. I know that there are decimation routines in scipy.signal, but I'm not sure if they're just 1D.
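For reference, a minimal sketch of the reshape-based decimation mentioned above (the function name is illustrative; it assumes both dimensions are exact multiples of the factor):

import numpy as np

def block_mean(a, fact):
    # Average non-overlapping fact x fact blocks of a 2-D array by reshaping
    # to (ny/fact, fact, nx/fact, fact) and averaging over the two block axes.
    ny, nx = a.shape
    return a.reshape(ny // fact, fact, nx // fact, fact).mean(axis=3).mean(axis=1)

A = np.random.rand(2000, 2000)
small = block_mean(A, 2)    # shape (1000, 1000); each entry is a 2x2 block mean

Unlike the hard-coded 2x2 sum, this works for any integer factor without writing out the offsets.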
In general, for integer-factor downsampling, I either do it with slicing like the above example, or use convolution (like ndimage.gaussian_filter with the appropriate bandwidth, which is quite fast) to prefilter followed by taking a view such as A[::3,::3] to downsample by a factor of three. Zach From andrew_giessel at hms.harvard.edu Sat Feb 11 15:14:54 2012 From: andrew_giessel at hms.harvard.edu (Andrew Giessel) Date: Sat, 11 Feb 2012 15:14:54 -0500 Subject: [SciPy-User] down-sampling an array by averaging - vectorized form? In-Reply-To: References: Message-ID: Heya Zach- On Sat, Feb 11, 2012 at 14:56, Zachary Pincus wrote: > Hi Andrew, > > > None of them are 'vectorized' persay but all are more clever or > effeicent ways of getting at the same problem. I thought I'd write a > couple of quick comments. > > It depends what you mean by "vectorized" -- none are using SIMD > instructions on the chip, but from the Matlab/numpy perspective I think > people often mean "vectorized" as "multiple data elements acted on by a > single command" such as C = A + B, where A and B are matrices. > > In any case, the reshaping approach is "vectorized" according to the > latter definition, which obviously really just means "the loops are pushed > down into C"... Yes, I guess that's more precisely what I mean- I would call the reshaping approach vectorized as well. > > The binning followed by scalar division is wicked fast and yields > results that are very close to my original algorithm. The reshaping seems > very clever and I am going to read it more carefully to learn some lessons > there, I think. > > For more information on reshaping to do decimation, see the "avoiding > loops when downsampling arrays" thread on the numpy-discussion list from > last week: > https://groups.google.com/forum/#!topic/numpy/qyDKJTj5jx4 > > There's a bit more discussion about how this works, and some memory-layout > caveats to be aware of. > > Also, instead of doing the reshaping, you could see if hard-coding the > averaging is faster. Here's how to do it for the 2x2 case: > B = (A[::2,::2] + A[1::2,::2] + A[::2,1::2] + A[1::2,::2])/4.0 Wow, last week? Guess that's what I get for not searching archives. I just joined both numpy-discussion and scipy-user this week. I will check that thread. > The ndimage.zoom approach is a very general approach (and roughly as > quick as the others). As far as I can tell, that function uses spline > interpolation for zoom factors > 1, and I'm unsure how it deals with zoom > factors < 1. It might do nearest neighbor or something like that, I wasn't > able to quickly determine from glancing at the source. If anyone knows, it > would be cool to hear. > > I'm pretty certain that the zoom function doesn't do anything smart for > image decimation/minification -- it just uses the requested interpolation > order to take a point-sample of the image at the calculated coordinates. > Lack of good decimation is a limitation of ndimage. I know that there are > decimation routines in scipy.signal, but I'm not sure if they're just 1D. > > In general, for integer-factor downsampling, I either do it with slicing > like the above example, or use convolution (like ndimage.gaussian_filter > with the appropriate bandwidth, which is quite fast) to prefilter followed > by taking a view such as A[::3,::3] to downsample by a factor of three. > Cheers, it's great to know what other people do. 
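And a minimal sketch of the prefilter-then-stride recipe from the previous paragraph (a sigma of about half the decimation factor is a rough bandwidth choice here, not a prescription):

import numpy as np
from scipy import ndimage

A = np.random.rand(2000, 2000)
factor = 3

# Low-pass filter first to limit aliasing, then keep every factor-th pixel.
smoothed = ndimage.gaussian_filter(A, sigma=factor / 2.0)
small = smoothed[::factor, ::factor]   # strided view, shape (667, 667)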
ag -- Andrew Giessel, PhD Department of Neurobiology, Harvard Medical School 220 Longwood Ave Boston, MA 02115 ph: 617.432.7971 email: andrew_giessel at hms.harvard.edu -------------- next part -------------- An HTML attachment was scrubbed... URL: From k-assem84 at hotmail.com Fri Feb 10 11:11:29 2012 From: k-assem84 at hotmail.com (suzana8447) Date: Fri, 10 Feb 2012 08:11:29 -0800 (PST) Subject: [SciPy-User] [SciPy-user] Covariance matrix Message-ID: <33301423.post@talk.nabble.com> Hello every body, I am using least square fit to fit some function to a given data. The fit is perfect with leastsq. Now, I need to calculate the covariance matrix whereby the diagonal terms represent the variances for the parameters. I need to know, if possible, how to extract the covariance matrix from leastsq. If there is no way to extract it, Are there any good methods that can be used to calculate the covariance matrix with high precision? Thanks in advance. -- View this message in context: http://old.nabble.com/Covariance-matrix-tp33301423p33301423.html Sent from the Scipy-User mailing list archive at Nabble.com. From charlesr.harris at gmail.com Sat Feb 11 21:47:26 2012 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sat, 11 Feb 2012 19:47:26 -0700 Subject: [SciPy-User] [SciPy-user] Covariance matrix In-Reply-To: <33301423.post@talk.nabble.com> References: <33301423.post@talk.nabble.com> Message-ID: On Fri, Feb 10, 2012 at 9:11 AM, suzana8447 wrote: > > Hello every body, > I am using least square fit to fit some function to a given data. The fit > is > perfect with leastsq. Now, I need to calculate the covariance matrix > whereby > the diagonal terms represent the variances for the parameters. > > I need to know, if possible, how to extract the covariance matrix from > leastsq. If there is no way to extract it, Are there any good methods that > can be used to calculate the covariance matrix with high precision? > If you pass the optional argument full_output=1 when calling leastsq the (scaled) covariance matrix will be returned in the slot after the solution. It needs to be multiplied by an estimated measurement variance determined from the residuals or by some other method. The documentation isn't quite right on that score, it says standard deviation. The computation of the covariance probably isn't the best numerically as its triangular factors are multiplied before inversion, rather than vice-verse. Patches welcome ;) Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From slasley at space.umd.edu Sun Feb 12 16:02:57 2012 From: slasley at space.umd.edu (Scott Lasley) Date: Sun, 12 Feb 2012 16:02:57 -0500 Subject: [SciPy-User] ANN: SciPy 0.10.1 release candidate 1 Message-ID: I downloaded scipy-0.10.1rc1.tar.gz from sourceforge and installed it on a MacPro running OS X 10.7.3, XCode 4.2.1, gfortran from the R Project, numpy 1.6.1 from sourceforge, and python 2.7.2 from python.org with these commands export MACOSX_DEPLOYMENT_TARGET=10.7 export CC=/usr/bin/gcc python setupegg.py build --fcompiler=gfortran python setupegg.py bdist_egg unset MACOSX_DEPLOYMENT_TARGET unset CC easy_install -U dist/scipy-0.10.1rc1-py2.7-macosx-10.6-intel.egg scipy crashes python and python-32 in TestDoubleIFFT. I get similar crashes with scipy 2.0.0.dev-8d1b91e and numpy 0.11.0.dev-912768f from github. $ python Python 2.7.2 (v2.7.2:8527427914a2, Jun 11 2011, 15:22:34) [GCC 4.2.1 (Apple Inc. 
build 5666) (dot 3)] on darwin Type "help", "copyright", "credits" or "license" for more information. >>> import scipy >>> scipy.test(verbose=10) Running unit tests for scipy NumPy version 1.6.1 NumPy is installed in /Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/numpy SciPy version 0.10.1rc1 SciPy is installed in /Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/scipy-0.10.1rc1-py2.7-macosx-10.6-intel.egg/scipy Python version 2.7.2 (v2.7.2:8527427914a2, Jun 11 2011, 15:22:34) [GCC 4.2.1 (Apple Inc. build 5666) (dot 3)] nose version 1.1.2 nose.config: INFO: Ignoring files matching ['^\\.', '^_', '^setup\\.py$'] nose.config: INFO: Excluding tests matching ['f2py_ext', 'f2py_f90_ext', 'gen_ext', 'pyrex_ext', 'swig_ext'] Tests cophenet(Z) on tdist data set. ... ok ... Check that updating stored values with exact ones worked. ... ok test_constants.test_fahrenheit_to_celcius ... ok test_constants.test_celcius_to_kelvin ... ok test_constants.test_kelvin_to_celcius ... ok test_constants.test_fahrenheit_to_kelvin ... ok test_constants.test_kelvin_to_fahrenheit ... ok test_constants.test_celcius_to_fahrenheit ... ok test_constants.test_lambda_to_nu ... ok test_constants.test_nu_to_lambda ... ok test_definition (test_basic.TestDoubleFFT) ... ok test_djbfft (test_basic.TestDoubleFFT) ... ok test_n_argument_real (test_basic.TestDoubleFFT) ... ok test_definition (test_basic.TestDoubleIFFT) ... FAIL test_definition_real (test_basic.TestDoubleIFFT) ... ok test_djbfft (test_basic.TestDoubleIFFT) ... FAIL test_random_complex (test_basic.TestDoubleIFFT) ... FAIL Python(68720) malloc: *** error for object 0x105312530: pointer being freed was not allocated *** set a breakpoint in malloc_error_break to debug Abort trap: 6 $ $ python-32 Python 2.7.2 (v2.7.2:8527427914a2, Jun 11 2011, 15:22:34) [GCC 4.2.1 (Apple Inc. build 5666) (dot 3)] on darwin Type "help", "copyright", "credits" or "license" for more information. >>> import scipy >>> scipy.test(verbose=10) Running unit tests for scipy NumPy version 1.6.1 NumPy is installed in /Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/numpy SciPy version 0.10.1rc1 SciPy is installed in /Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/scipy-0.10.1rc1-py2.7-macosx-10.6-intel.egg/scipy Python version 2.7.2 (v2.7.2:8527427914a2, Jun 11 2011, 15:22:34) [GCC 4.2.1 (Apple Inc. build 5666) (dot 3)] nose version 1.1.2 nose.config: INFO: Ignoring files matching ['^\\.', '^_', '^setup\\.py$'] nose.config: INFO: Excluding tests matching ['f2py_ext', 'f2py_f90_ext', 'gen_ext', 'pyrex_ext', 'swig_ext'] ... test_definition (test_basic.TestDoubleFFT) ... ok test_djbfft (test_basic.TestDoubleFFT) ... ok test_n_argument_real (test_basic.TestDoubleFFT) ... ok test_definition (test_basic.TestDoubleIFFT) ... Python(68743) malloc: *** error for object 0x3a33604: incorrect checksum for freed object - object was probably modified after being freed. *** set a breakpoint in malloc_error_break to debug FAIL test_definition_real (test_basic.TestDoubleIFFT) ... ok test_djbfft (test_basic.TestDoubleIFFT) ... Python(68743) malloc: *** error for object 0x3a2e2c4: incorrect checksum for freed object - object was probably modified after being freed. *** set a breakpoint in malloc_error_break to debug FAIL test_random_complex (test_basic.TestDoubleIFFT) ... FAIL test_random_real (test_basic.TestDoubleIFFT) ... 
Python(68743) malloc: *** error for object 0x354c604: incorrect checksum for freed object - object was probably modified after being freed. *** set a breakpoint in malloc_error_break to debug FAIL test_size_accuracy (test_basic.TestDoubleIFFT) ... Bus error: 10 $ From ralf.gommers at googlemail.com Mon Feb 13 01:02:20 2012 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Mon, 13 Feb 2012 07:02:20 +0100 Subject: [SciPy-User] ANN: SciPy 0.10.1 release candidate 1 In-Reply-To: References: Message-ID: On Sun, Feb 12, 2012 at 10:02 PM, Scott Lasley wrote: > I downloaded scipy-0.10.1rc1.tar.gz from sourceforge and installed it on a > MacPro running OS X 10.7.3, XCode 4.2.1, gfortran from the R Project, numpy > 1.6.1 from sourceforge, and python 2.7.2 from python.org with these > commands > > export MACOSX_DEPLOYMENT_TARGET=10.7 > export CC=/usr/bin/gcc > python setupegg.py build --fcompiler=gfortran > python setupegg.py bdist_egg > unset MACOSX_DEPLOYMENT_TARGET > unset CC > easy_install -U dist/scipy-0.10.1rc1-py2.7-macosx-10.6-intel.egg > > scipy crashes python and python-32 in TestDoubleIFFT. I get similar > crashes with scipy 2.0.0.dev-8d1b91e and numpy 0.11.0.dev-912768f from > github. > This looks like http://projects.scipy.org/scipy/ticket/1496, using 0.10.0 with the above build recipe should crash in the same way. Can you comment on that ticket, double check that you're using non-LLVM gcc (looks unrelated, but you forgot to export CXX) and the correct fortran compiler (there are two from R-project, the one linked on the front page is the wrong one last time I checked), and try the recipe linked to there? OS X Lion build is still very painful, if someone wants to have a go at making numpy/scipy work with llvm-gcc that would be very helpful. Thanks, Ralf > $ python > Python 2.7.2 (v2.7.2:8527427914a2, Jun 11 2011, 15:22:34) > [GCC 4.2.1 (Apple Inc. build 5666) (dot 3)] on darwin > Type "help", "copyright", "credits" or "license" for more information. > >>> import scipy > >>> scipy.test(verbose=10) > Running unit tests for scipy > NumPy version 1.6.1 > NumPy is installed in > /Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/numpy > SciPy version 0.10.1rc1 > SciPy is installed in > /Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/scipy-0.10.1rc1-py2.7-macosx-10.6-intel.egg/scipy > Python version 2.7.2 (v2.7.2:8527427914a2, Jun 11 2011, 15:22:34) [GCC > 4.2.1 (Apple Inc. build 5666) (dot 3)] > nose version 1.1.2 > nose.config: INFO: Ignoring files matching ['^\\.', '^_', '^setup\\.py$'] > nose.config: INFO: Excluding tests matching ['f2py_ext', 'f2py_f90_ext', > 'gen_ext', 'pyrex_ext', 'swig_ext'] > Tests cophenet(Z) on tdist data set. ... ok > ... > Check that updating stored values with exact ones worked. ... ok > test_constants.test_fahrenheit_to_celcius ... ok > test_constants.test_celcius_to_kelvin ... ok > test_constants.test_kelvin_to_celcius ... ok > test_constants.test_fahrenheit_to_kelvin ... ok > test_constants.test_kelvin_to_fahrenheit ... ok > test_constants.test_celcius_to_fahrenheit ... ok > test_constants.test_lambda_to_nu ... ok > test_constants.test_nu_to_lambda ... ok > test_definition (test_basic.TestDoubleFFT) ... ok > test_djbfft (test_basic.TestDoubleFFT) ... ok > test_n_argument_real (test_basic.TestDoubleFFT) ... ok > test_definition (test_basic.TestDoubleIFFT) ... FAIL > test_definition_real (test_basic.TestDoubleIFFT) ... ok > test_djbfft (test_basic.TestDoubleIFFT) ... 
FAIL > test_random_complex (test_basic.TestDoubleIFFT) ... FAIL > Python(68720) malloc: *** error for object 0x105312530: pointer being > freed was not allocated > *** set a breakpoint in malloc_error_break to debug > Abort trap: 6 > $ > > $ python-32 > Python 2.7.2 (v2.7.2:8527427914a2, Jun 11 2011, 15:22:34) > [GCC 4.2.1 (Apple Inc. build 5666) (dot 3)] on darwin > Type "help", "copyright", "credits" or "license" for more information. > >>> import scipy > >>> scipy.test(verbose=10) > Running unit tests for scipy > NumPy version 1.6.1 > NumPy is installed in > /Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/numpy > SciPy version 0.10.1rc1 > SciPy is installed in > /Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/scipy-0.10.1rc1-py2.7-macosx-10.6-intel.egg/scipy > Python version 2.7.2 (v2.7.2:8527427914a2, Jun 11 2011, 15:22:34) [GCC > 4.2.1 (Apple Inc. build 5666) (dot 3)] > nose version 1.1.2 > nose.config: INFO: Ignoring files matching ['^\\.', '^_', '^setup\\.py$'] > nose.config: INFO: Excluding tests matching ['f2py_ext', 'f2py_f90_ext', > 'gen_ext', 'pyrex_ext', 'swig_ext'] > ... > test_definition (test_basic.TestDoubleFFT) ... ok > test_djbfft (test_basic.TestDoubleFFT) ... ok > test_n_argument_real (test_basic.TestDoubleFFT) ... ok > test_definition (test_basic.TestDoubleIFFT) ... Python(68743) malloc: *** > error for object 0x3a33604: incorrect checksum for freed object - object > was probably modified after being freed. > *** set a breakpoint in malloc_error_break to debug > FAIL > test_definition_real (test_basic.TestDoubleIFFT) ... ok > test_djbfft (test_basic.TestDoubleIFFT) ... Python(68743) malloc: *** > error for object 0x3a2e2c4: incorrect checksum for freed object - object > was probably modified after being freed. > *** set a breakpoint in malloc_error_break to debug > FAIL > test_random_complex (test_basic.TestDoubleIFFT) ... FAIL > test_random_real (test_basic.TestDoubleIFFT) ... Python(68743) malloc: *** > error for object 0x354c604: incorrect checksum for freed object - object > was probably modified after being freed. > *** set a breakpoint in malloc_error_break to debug > FAIL > test_size_accuracy (test_basic.TestDoubleIFFT) ... Bus error: 10 > $ > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > -------------- next part -------------- An HTML attachment was scrubbed... URL: From eckjoh2 at web.de Mon Feb 13 03:59:20 2012 From: eckjoh2 at web.de (Johannes Eckstein) Date: Mon, 13 Feb 2012 09:59:20 +0100 Subject: [SciPy-User] [SciPy-user] Covariance matrix In-Reply-To: References: <33301423.post@talk.nabble.com> Message-ID: <4F38D0E8.4000005@web.de> Hi All, I have an additional question to that below. how can I do a matrix multiplication of a matrix X with shape: (240001, 4) M = X * X.H when I do this I get the following: return N.dot(self, asmatrix(other)) ValueError: array is too big. What is the best way to avoid this error? Thanks in advance, Johannes > > > On Fri, Feb 10, 2012 at 9:11 AM, suzana8447 > wrote: > > > Hello every body, > I am using least square fit to fit some function to a given data. > The fit is > perfect with leastsq. Now, I need to calculate the covariance > matrix whereby > the diagonal terms represent the variances for the parameters. > > I need to know, if possible, how to extract the covariance matrix from > leastsq. 
If there is no way to extract it, Are there any good > methods that > can be used to calculate the covariance matrix with high precision? > > > If you pass the optional argument full_output=1 when calling leastsq > the (scaled) covariance matrix will be returned in the slot after the > solution. It needs to be multiplied by an estimated measurement > variance determined from the residuals or by some other method. The > documentation isn't quite right on that score, it says standard > deviation. The computation of the covariance probably isn't the best > numerically as its triangular factors are multiplied before inversion, > rather than vice-verse. Patches welcome ;) > > Chuck > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user -------------- next part -------------- An HTML attachment was scrubbed... URL: From sgarcia at olfac.univ-lyon1.fr Mon Feb 13 04:09:11 2012 From: sgarcia at olfac.univ-lyon1.fr (Samuel Garcia) Date: Mon, 13 Feb 2012 10:09:11 +0100 Subject: [SciPy-User] [ANN] release of Neo 0.2.0 Message-ID: <4F38D337.7000703@olfac.univ-lyon1.fr> Dear scipy list, We are proud to announce the 0.2.0 release ofNeo , a Python library for working with electrophysiology data, whether from biological experiments or from simulations. Neo is a package for representing electrophysiology data in Python, together with support for reading a wide range of neurophysiology file formats, including Spike2, NeuroExplorer, AlphaOmega, Axon, Blackrock, Plexon, Tdt, and support for writing to a subset of these formats plus non-proprietary formats including HDF5. The goal of Neo is to improve interoperability between Python tools for analyzing, visualizing and generating electrophysiology data by providing a common, shared object model. In order to be as lightweight a dependency as possible, Neo is deliberately limited to represention of data, with no functions for data analysis or visualization. Neo implements a hierarchical data model well adapted to intracellular and extracellular electrophysiology and EEG data with support for multi-electrodes (for example tetrodes). Neo's data objects build on thequantities package, which in turn builds on NumPy by adding support for physical dimensions. Thus Neo objects behave just like normal NumPy arrays, but with additional metadata, checks for dimensional consistency and automatic unit conversion. Documentation: http://packages.python.org/neo/ Licence: Modified BSD Download: fromPyPI or from theINCF Software Center Samuel Garcia -- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Samuel Garcia Lyon Neuroscience CNRS - UMR5292 - INSERM U1028 - Universite Claude Bernard LYON 1 Equipe R et D 50, avenue Tony Garnier 69366 LYON Cedex 07 FRANCE T?l : 04 37 28 74 24 Fax : 04 37 28 76 01 http://olfac.univ-lyon1.fr/unite/equipe-07/ http://neuralensemble.org/trac/OpenElectrophy ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -------------- next part -------------- An HTML attachment was scrubbed... 
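Back on the covariance-matrix thread, a minimal sketch of the recipe described above: call leastsq with full_output=1, take the covariance from the slot after the solution, and scale it by the estimated residual variance. The exponential model and the data here are made up purely for illustration:

import numpy as np
from scipy.optimize import leastsq

x = np.linspace(0.0, 10.0, 50)
y = 2.5 * np.exp(-1.3 * x) + 0.05 * np.random.randn(x.size)

def residuals(p, x, y):
    a, b = p
    return y - a * np.exp(-b * x)

p0 = (1.0, 1.0)
popt, cov_x, infodict, mesg, ier = leastsq(residuals, p0, args=(x, y),
                                           full_output=1)

# Scale the (relative) covariance by the residual variance:
# s^2 = sum(r_i^2) / (N - n_params)
s_sq = (infodict['fvec'] ** 2).sum() / (len(x) - len(p0))
pcov = cov_x * s_sq
perr = np.sqrt(np.diag(pcov))   # one-sigma parameter uncertainties

Note that scipy.optimize.curve_fit applies essentially this scaling internally, so it is a convenient alternative when the model has the simple y = f(x, params) form.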
URL: From pav at iki.fi Mon Feb 13 06:33:21 2012 From: pav at iki.fi (Pauli Virtanen) Date: Mon, 13 Feb 2012 12:33:21 +0100 Subject: [SciPy-User] [SciPy-user] Covariance matrix In-Reply-To: <4F38D0E8.4000005@web.de> References: <33301423.post@talk.nabble.com> <4F38D0E8.4000005@web.de> Message-ID: 13.02.2012 09:59, Johannes Eckstein kirjoitti: [clip] > how can I do a matrix multiplication of a matrix X with shape: (240001, 4) > > M = X * X.H > > when I do this I get the following: > return N.dot(self, asmatrix(other)) > ValueError: array is too big. > > What is the best way to avoid this error? The result would be a 240001 x 240001 matrix that consumes 430 GB of memory. Do you really have that much available? -- Pauli Virtanen From Dieter.Werthmuller at ed.ac.uk Mon Feb 13 08:34:46 2012 From: Dieter.Werthmuller at ed.ac.uk (=?ISO-8859-1?Q?Dieter_Werthm=FCller?=) Date: Mon, 13 Feb 2012 13:34:46 +0000 Subject: [SciPy-User] [SciPy-user] adaptive simulated annealing (ASA) Message-ID: <4F391176.6080906@ed.ac.uk> Hi there, The message 'adaptive simulated annealing (ASA)' is over four years old now, asking for a python implementation of ASA (http://www.ingber.com/#ASA). I was wondering if such an implementation is around today? Kind regards, Dieter -- The University of Edinburgh is a charitable body, registered in Scotland, with registration number SC005336. From slasley at space.umd.edu Mon Feb 13 10:26:38 2012 From: slasley at space.umd.edu (Scott Lasley) Date: Mon, 13 Feb 2012 10:26:38 -0500 Subject: [SciPy-User] ANN: SciPy 0.10.1 release candidate 1 In-Reply-To: References: Message-ID: <3889085B-59E8-4324-8021-1E4992DB571E@space.umd.edu> Thanks for the help. /usr/bin/gcc -> llvm-gcc-4.2 so it appears to be the llvm-gcc issue from ticket 1496. I installed http://r.research.att.com/tools/gcc-42-5666.3-darwin11.pkg to be sure I had the correct gcc-4.2 and gfortran for XCode 4.2.1. I was not able to make a universal version of scipy with gcc-4.2, but I could build scipy-0.10.1rc1 after specifying the -arch CFLAG. scipy failed 7 tests from test_arpack.test_symmetric_modes export MACOSX_DEPLOYMENT_TARGET=10.7 export CC=/usr/bin/gcc-4.2 export CXX=/usr/bin/g++-4.2 export CFLAGS="-arch x86_64" export FFLAGS=-ff2c python setupegg.py build --fcompiler=gfortran python setupegg.py bdist_egg $ export CFLAGS="-arch x86_64" $ python >>> import scipy >>> scipy.test() Running unit tests for scipy NumPy version 1.6.1 NumPy is installed in /Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/numpy-1.6.1-py2.7-macosx-10.6-intel.egg/numpy SciPy version 0.10.1rc1 SciPy is installed in /Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/scipy-0.10.1rc1-py2.7-macosx-10.6-intel.egg/scipy Python version 2.7.2 (v2.7.2:8527427914a2, Jun 11 2011, 15:22:34) [GCC 4.2.1 (Apple Inc. build 5666) (dot 3)] nose version 1.1.2 ............................................................................................................................................................................................................................K............................................................................................................/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/scipy-0.10.1rc1-py2.7-macosx-10.6-intel.egg/scipy/interpolate/fitpack2.py:674: UserWarning: The coefficients of the spline returned have been computed as the minimal norm least-squares solution of a (numerically) rank deficient system (deficiency=7). 
If deficiency is large, the results may be inaccurate. Deficiency may strongly depend on the value of eps. warnings.warn(message) ....../Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/scipy-0.10.1rc1-py2.7-macosx-10.6-intel.egg/scipy/interpolate/fitpack2.py:605: UserWarning: The required storage space exceeds the available storage space: nxest or nyest too small, or s too small. The weighted least-squares spline corresponds to the current set of knots. warnings.warn(message) ........................K..K....../Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/numpy-1.6.1-py2.7-macosx-10.6-intel.egg/numpy/core/numeric.py:1920: RuntimeWarning: invalid value encountered in absolute return all(less_equal(absolute(x-y), atol + rtol * absolute(y))) ............................................................................................................................................................................................................................................................................................................................................................................................................................................./Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/scipy-0.10.1rc1-py2.7-macosx-10.6-intel.egg/scipy/io/wavfile.py:31: WavFileWarning: Unfamiliar format bytes warnings.warn("Unfamiliar format bytes", WavFileWarning) /Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/scipy-0.10.1rc1-py2.7-macosx-10.6-intel.egg/scipy/io/wavfile.py:121: WavFileWarning: chunk not understood warnings.warn("chunk not understood", WavFileWarning) ...............................................................................................................................................................................................................................SSSSSS......SSSSSS......SSSS...............................................................................S............................................................................................................................................................................................................................................................K......................................................................................................................................................................................................SSSSS............S..........................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................SSSSSSSSSSS.........../Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/scipy-0.10.1rc1-py2.7-macosx-10.6-intel.egg/scipy/sparse/linalg/eigen/arpack/arpack.py:63: UserWarning: Single-precision types in `eigs` and 
`eighs` are not supported currently. Double precision routines are used instead. warnings.warn("Single-precision types in `eigs` and `eighs` " ....F.F.....................F...........F.F..............................................................................................F........................F.........................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................K...............................................................K...........................................................................................................................................................KK.............................................................................................................................................................................................................................................................................................................................................................................................................................................K.K.............................................................................................................................................................................................................................................................................................................................................................................................K........K..............SSSSSSS..........................................................................................................................................................S.............................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................. 
====================================================================== FAIL: test_arpack.test_symmetric_modes(True, , 'f', 2, 'LM', None, 0.5, , None, 'normal') ---------------------------------------------------------------------- Traceback (most recent call last): File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/nose-1.1.2-py2.7.egg/nose/case.py", line 197, in runTest self.test(*self.arg) File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/scipy-0.10.1rc1-py2.7-macosx-10.6-intel.egg/scipy/sparse/linalg/eigen/arpack/tests/test_arpack.py", line 235, in eval_evec assert_allclose(LHS, RHS, rtol=rtol, atol=atol, err_msg=err) File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/numpy-1.6.1-py2.7-macosx-10.6-intel.egg/numpy/testing/utils.py", line 1168, in assert_allclose verbose=verbose, header=header) File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/numpy-1.6.1-py2.7-macosx-10.6-intel.egg/numpy/testing/utils.py", line 636, in assert_array_compare raise AssertionError(msg) AssertionError: Not equal to tolerance rtol=0.00178814, atol=0.000357628 error for eigsh:standard, typ=f, which=LM, sigma=0.5, mattype=aslinearoperator, OPpart=None, mode=normal (mismatch 100.0%) x: array([[ 0.23815642, 0.1763755 ], [-0.10785346, -0.32103487], [ 0.12468303, -0.11230416],... y: array([[ 0.23815642, 0.24814051], [-0.10785347, -0.15634772], [ 0.12468302, 0.05671416],... ====================================================================== FAIL: test_arpack.test_symmetric_modes(True, , 'f', 2, 'LM', None, 0.5, , None, 'cayley') ---------------------------------------------------------------------- Traceback (most recent call last): File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/nose-1.1.2-py2.7.egg/nose/case.py", line 197, in runTest self.test(*self.arg) File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/scipy-0.10.1rc1-py2.7-macosx-10.6-intel.egg/scipy/sparse/linalg/eigen/arpack/tests/test_arpack.py", line 235, in eval_evec assert_allclose(LHS, RHS, rtol=rtol, atol=atol, err_msg=err) File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/numpy-1.6.1-py2.7-macosx-10.6-intel.egg/numpy/testing/utils.py", line 1168, in assert_allclose verbose=verbose, header=header) File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/numpy-1.6.1-py2.7-macosx-10.6-intel.egg/numpy/testing/utils.py", line 636, in assert_array_compare raise AssertionError(msg) AssertionError: Not equal to tolerance rtol=0.00178814, atol=0.000357628 error for eigsh:standard, typ=f, which=LM, sigma=0.5, mattype=aslinearoperator, OPpart=None, mode=cayley (mismatch 100.0%) x: array([[ 0.23815693, -0.33630507], [-0.10785286, 0.02168 ], [ 0.12468344, -0.11036437],... y: array([[ 0.23815643, -0.2405392 ], [-0.10785349, 0.14390968], [ 0.12468311, -0.04574991],... 
====================================================================== FAIL: test_arpack.test_symmetric_modes(True, , 'f', 2, 'LA', None, 0.5, , None, 'normal') ---------------------------------------------------------------------- Traceback (most recent call last): File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/nose-1.1.2-py2.7.egg/nose/case.py", line 197, in runTest self.test(*self.arg) File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/scipy-0.10.1rc1-py2.7-macosx-10.6-intel.egg/scipy/sparse/linalg/eigen/arpack/tests/test_arpack.py", line 235, in eval_evec assert_allclose(LHS, RHS, rtol=rtol, atol=atol, err_msg=err) File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/numpy-1.6.1-py2.7-macosx-10.6-intel.egg/numpy/testing/utils.py", line 1168, in assert_allclose verbose=verbose, header=header) File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/numpy-1.6.1-py2.7-macosx-10.6-intel.egg/numpy/testing/utils.py", line 636, in assert_array_compare raise AssertionError(msg) AssertionError: Not equal to tolerance rtol=0.00178814, atol=0.000357628 error for eigsh:standard, typ=f, which=LA, sigma=0.5, mattype=aslinearoperator, OPpart=None, mode=normal (mismatch 100.0%) x: array([[ 28.80129188, -0.6379945 ], [ 34.79312355, 0.27066791], [-270.23255444, 0.4851834 ],... y: array([[ 3.93467650e+03, -6.37994494e-01], [ 3.90913859e+03, 2.70667916e-01], [ -3.62176382e+04, 4.85183382e-01],... ====================================================================== FAIL: test_arpack.test_symmetric_modes(True, , 'f', 2, 'SA', None, 0.5, , None, 'normal') ---------------------------------------------------------------------- Traceback (most recent call last): File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/nose-1.1.2-py2.7.egg/nose/case.py", line 197, in runTest self.test(*self.arg) File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/scipy-0.10.1rc1-py2.7-macosx-10.6-intel.egg/scipy/sparse/linalg/eigen/arpack/tests/test_arpack.py", line 235, in eval_evec assert_allclose(LHS, RHS, rtol=rtol, atol=atol, err_msg=err) File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/numpy-1.6.1-py2.7-macosx-10.6-intel.egg/numpy/testing/utils.py", line 1168, in assert_allclose verbose=verbose, header=header) File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/numpy-1.6.1-py2.7-macosx-10.6-intel.egg/numpy/testing/utils.py", line 636, in assert_array_compare raise AssertionError(msg) AssertionError: Not equal to tolerance rtol=0.00178814, atol=0.000357628 error for eigsh:standard, typ=f, which=SA, sigma=0.5, mattype=aslinearoperator, OPpart=None, mode=normal (mismatch 100.0%) x: array([[ 0.26260981, 0.23815559], [-0.09760907, -0.10785484], [ 0.06149647, 0.12468203],... y: array([[ 0.23744165, 0.2381564 ], [-0.13633069, -0.10785359], [ 0.03132561, 0.12468301],... 
====================================================================== FAIL: test_arpack.test_symmetric_modes(True, , 'f', 2, 'SA', None, 0.5, , None, 'cayley') ---------------------------------------------------------------------- Traceback (most recent call last): File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/nose-1.1.2-py2.7.egg/nose/case.py", line 197, in runTest self.test(*self.arg) File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/scipy-0.10.1rc1-py2.7-macosx-10.6-intel.egg/scipy/sparse/linalg/eigen/arpack/tests/test_arpack.py", line 235, in eval_evec assert_allclose(LHS, RHS, rtol=rtol, atol=atol, err_msg=err) File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/numpy-1.6.1-py2.7-macosx-10.6-intel.egg/numpy/testing/utils.py", line 1168, in assert_allclose verbose=verbose, header=header) File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/numpy-1.6.1-py2.7-macosx-10.6-intel.egg/numpy/testing/utils.py", line 636, in assert_array_compare raise AssertionError(msg) AssertionError: Not equal to tolerance rtol=0.00178814, atol=0.000357628 error for eigsh:standard, typ=f, which=SA, sigma=0.5, mattype=aslinearoperator, OPpart=None, mode=cayley (mismatch 100.0%) x: array([[ 0.29524244, -0.2381569 ], [-0.08169955, 0.10785299], [ 0.06645597, -0.12468332],... y: array([[ 0.24180251, -0.23815646], [-0.14191195, 0.10785349], [ 0.03568392, -0.12468307],... ====================================================================== FAIL: test_arpack.test_symmetric_modes(True, , 'f', 2, 'SM', None, 0.5, , None, 'buckling') ---------------------------------------------------------------------- Traceback (most recent call last): File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/nose-1.1.2-py2.7.egg/nose/case.py", line 197, in runTest self.test(*self.arg) File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/scipy-0.10.1rc1-py2.7-macosx-10.6-intel.egg/scipy/sparse/linalg/eigen/arpack/tests/test_arpack.py", line 235, in eval_evec assert_allclose(LHS, RHS, rtol=rtol, atol=atol, err_msg=err) File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/numpy-1.6.1-py2.7-macosx-10.6-intel.egg/numpy/testing/utils.py", line 1168, in assert_allclose verbose=verbose, header=header) File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/numpy-1.6.1-py2.7-macosx-10.6-intel.egg/numpy/testing/utils.py", line 636, in assert_array_compare raise AssertionError(msg) AssertionError: Not equal to tolerance rtol=0.00178814, atol=0.000357628 error for eigsh:general, typ=f, which=SM, sigma=0.5, mattype=aslinearoperator, OPpart=None, mode=buckling (mismatch 100.0%) x: array([[-0.10940548, 0.01676016], [-0.07154097, 0.4628113 ], [ 0.06895222, 0.49206394],... y: array([[-0.10940547, 0.05459438], [-0.07154103, 0.31407543], [ 0.06895217, 0.37578294],... 
====================================================================== FAIL: test_arpack.test_symmetric_modes(True, , 'f', 2, 'SA', None, 0.5, , None, 'cayley') ---------------------------------------------------------------------- Traceback (most recent call last): File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/nose-1.1.2-py2.7.egg/nose/case.py", line 197, in runTest self.test(*self.arg) File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/scipy-0.10.1rc1-py2.7-macosx-10.6-intel.egg/scipy/sparse/linalg/eigen/arpack/tests/test_arpack.py", line 235, in eval_evec assert_allclose(LHS, RHS, rtol=rtol, atol=atol, err_msg=err) File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/numpy-1.6.1-py2.7-macosx-10.6-intel.egg/numpy/testing/utils.py", line 1168, in assert_allclose verbose=verbose, header=header) File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/numpy-1.6.1-py2.7-macosx-10.6-intel.egg/numpy/testing/utils.py", line 636, in assert_array_compare raise AssertionError(msg) AssertionError: Not equal to tolerance rtol=0.00178814, atol=0.000357628 error for eigsh:general, typ=f, which=SA, sigma=0.5, mattype=aslinearoperator, OPpart=None, mode=cayley (mismatch 100.0%) x: array([[-0.4404992 , -0.01935683], [-0.25650678, -0.11053132], [-0.36893024, -0.13223556],... y: array([[-0.44017013, -0.0193569 ], [-0.25525379, -0.11053158], [-0.36818443, -0.13223571],... ---------------------------------------------------------------------- Ran 5101 tests in 84.089s FAILED (KNOWNFAIL=12, SKIP=42, failures=7) On Feb 13, 2012, at 1:02 AM, Ralf Gommers wrote: > > > On Sun, Feb 12, 2012 at 10:02 PM, Scott Lasley wrote: > I downloaded scipy-0.10.1rc1.tar.gz from sourceforge and installed it on a MacPro running OS X 10.7.3, XCode 4.2.1, gfortran from the R Project, numpy 1.6.1 from sourceforge, and python 2.7.2 from python.org with these commands > > export MACOSX_DEPLOYMENT_TARGET=10.7 > export CC=/usr/bin/gcc > python setupegg.py build --fcompiler=gfortran > python setupegg.py bdist_egg > unset MACOSX_DEPLOYMENT_TARGET > unset CC > easy_install -U dist/scipy-0.10.1rc1-py2.7-macosx-10.6-intel.egg > > scipy crashes python and python-32 in TestDoubleIFFT. I get similar crashes with scipy 2.0.0.dev-8d1b91e and numpy 0.11.0.dev-912768f from github. > > This looks like http://projects.scipy.org/scipy/ticket/1496, using 0.10.0 with the above build recipe should crash in the same way. Can you comment on that ticket, double check that you're using non-LLVM gcc (looks unrelated, but you forgot to export CXX) and the correct fortran compiler (there are two from R-project, the one linked on the front page is the wrong one last time I checked), and try the recipe linked to there? > > OS X Lion build is still very painful, if someone wants to have a go at making numpy/scipy work with llvm-gcc that would be very helpful. > > Thanks, > Ralf > > > $ python > Python 2.7.2 (v2.7.2:8527427914a2, Jun 11 2011, 15:22:34) > [GCC 4.2.1 (Apple Inc. build 5666) (dot 3)] on darwin > Type "help", "copyright", "credits" or "license" for more information. 
> >>> import scipy > >>> scipy.test(verbose=10) > Running unit tests for scipy > NumPy version 1.6.1 > NumPy is installed in /Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/numpy > SciPy version 0.10.1rc1 > SciPy is installed in /Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/scipy-0.10.1rc1-py2.7-macosx-10.6-intel.egg/scipy > Python version 2.7.2 (v2.7.2:8527427914a2, Jun 11 2011, 15:22:34) [GCC 4.2.1 (Apple Inc. build 5666) (dot 3)] > nose version 1.1.2 > nose.config: INFO: Ignoring files matching ['^\\.', '^_', '^setup\\.py$'] > nose.config: INFO: Excluding tests matching ['f2py_ext', 'f2py_f90_ext', 'gen_ext', 'pyrex_ext', 'swig_ext'] > Tests cophenet(Z) on tdist data set. ... ok > ... > Check that updating stored values with exact ones worked. ... ok > test_constants.test_fahrenheit_to_celcius ... ok > test_constants.test_celcius_to_kelvin ... ok > test_constants.test_kelvin_to_celcius ... ok > test_constants.test_fahrenheit_to_kelvin ... ok > test_constants.test_kelvin_to_fahrenheit ... ok > test_constants.test_celcius_to_fahrenheit ... ok > test_constants.test_lambda_to_nu ... ok > test_constants.test_nu_to_lambda ... ok > test_definition (test_basic.TestDoubleFFT) ... ok > test_djbfft (test_basic.TestDoubleFFT) ... ok > test_n_argument_real (test_basic.TestDoubleFFT) ... ok > test_definition (test_basic.TestDoubleIFFT) ... FAIL > test_definition_real (test_basic.TestDoubleIFFT) ... ok > test_djbfft (test_basic.TestDoubleIFFT) ... FAIL > test_random_complex (test_basic.TestDoubleIFFT) ... FAIL > Python(68720) malloc: *** error for object 0x105312530: pointer being freed was not allocated > *** set a breakpoint in malloc_error_break to debug > Abort trap: 6 > $ > > $ python-32 > Python 2.7.2 (v2.7.2:8527427914a2, Jun 11 2011, 15:22:34) > [GCC 4.2.1 (Apple Inc. build 5666) (dot 3)] on darwin > Type "help", "copyright", "credits" or "license" for more information. > >>> import scipy > >>> scipy.test(verbose=10) > Running unit tests for scipy > NumPy version 1.6.1 > NumPy is installed in /Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/numpy > SciPy version 0.10.1rc1 > SciPy is installed in /Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/scipy-0.10.1rc1-py2.7-macosx-10.6-intel.egg/scipy > Python version 2.7.2 (v2.7.2:8527427914a2, Jun 11 2011, 15:22:34) [GCC 4.2.1 (Apple Inc. build 5666) (dot 3)] > nose version 1.1.2 > nose.config: INFO: Ignoring files matching ['^\\.', '^_', '^setup\\.py$'] > nose.config: INFO: Excluding tests matching ['f2py_ext', 'f2py_f90_ext', 'gen_ext', 'pyrex_ext', 'swig_ext'] > ... > test_definition (test_basic.TestDoubleFFT) ... ok > test_djbfft (test_basic.TestDoubleFFT) ... ok > test_n_argument_real (test_basic.TestDoubleFFT) ... ok > test_definition (test_basic.TestDoubleIFFT) ... Python(68743) malloc: *** error for object 0x3a33604: incorrect checksum for freed object - object was probably modified after being freed. > *** set a breakpoint in malloc_error_break to debug > FAIL > test_definition_real (test_basic.TestDoubleIFFT) ... ok > test_djbfft (test_basic.TestDoubleIFFT) ... Python(68743) malloc: *** error for object 0x3a2e2c4: incorrect checksum for freed object - object was probably modified after being freed. > *** set a breakpoint in malloc_error_break to debug > FAIL > test_random_complex (test_basic.TestDoubleIFFT) ... FAIL > test_random_real (test_basic.TestDoubleIFFT) ... 
Python(68743) malloc: *** error for object 0x354c604: incorrect checksum for freed object - object was probably modified after being freed. > *** set a breakpoint in malloc_error_break to debug > FAIL > test_size_accuracy (test_basic.TestDoubleIFFT) ... Bus error: 10 > $ From eckjoh2 at web.de Mon Feb 13 10:30:43 2012 From: eckjoh2 at web.de (Johannes Eckstein) Date: Mon, 13 Feb 2012 16:30:43 +0100 Subject: [SciPy-User] [SciPy-user] Covariance matrix In-Reply-To: References: <33301423.post@talk.nabble.com> <4F38D0E8.4000005@web.de> Message-ID: <4F392CA3.4010500@web.de> Haha, Thanks Pauli maybe good, that I don't have that much memory. I just realized that I was confused by the indices... I had it the way round, before... another question: M = X.H * X with X being a de-trended process does give the (unscaled) covariance matrix? cheers, Johannes > 13.02.2012 09:59, Johannes Eckstein kirjoitti: > [clip] >> how can I do a matrix multiplication of a matrix X with shape: (240001, 4) >> >> M = X * X.H >> >> when I do this I get the following: >> return N.dot(self, asmatrix(other)) >> ValueError: array is too big. >> >> What is the best way to avoid this error? > The result would be a 240001 x 240001 matrix that consumes 430 GB of > memory. Do you really have that much available? > From josef.pktd at gmail.com Mon Feb 13 10:40:12 2012 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Mon, 13 Feb 2012 10:40:12 -0500 Subject: [SciPy-User] [SciPy-user] Covariance matrix In-Reply-To: <4F392CA3.4010500@web.de> References: <33301423.post@talk.nabble.com> <4F38D0E8.4000005@web.de> <4F392CA3.4010500@web.de> Message-ID: On Mon, Feb 13, 2012 at 10:30 AM, Johannes Eckstein wrote: > Haha, Thanks Pauli > > maybe good, that I don't have that much memory. > I just realized that I was confused by the indices... > I had it the way round, before... another question: > > M = X.H * X > > with X being a de-trended process does give the (unscaled) covariance > matrix? for linear least squares the unscaled covariance matrix is the inverse of dot(X.T, X) for nonlinear least squares X is replaced by the Jacobian. Josef > > cheers, Johannes >> 13.02.2012 09:59, Johannes Eckstein kirjoitti: >> [clip] >>> how can I do a matrix multiplication of a matrix X with shape: (240001, 4) >>> >>> M = X * X.H >>> >>> when I do this I get the following: >>> ? ? ?return N.dot(self, asmatrix(other)) >>> ValueError: array is too big. >>> >>> What is the best way to avoid this error? >> The result would be a 240001 x 240001 matrix that consumes 430 GB of >> memory. Do you really have that much available? >> > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user From josef.pktd at gmail.com Mon Feb 13 10:41:10 2012 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Mon, 13 Feb 2012 10:41:10 -0500 Subject: [SciPy-User] [SciPy-user] Covariance matrix In-Reply-To: References: <33301423.post@talk.nabble.com> <4F38D0E8.4000005@web.de> <4F392CA3.4010500@web.de> Message-ID: On Mon, Feb 13, 2012 at 10:40 AM, wrote: > On Mon, Feb 13, 2012 at 10:30 AM, Johannes Eckstein wrote: >> Haha, Thanks Pauli >> >> maybe good, that I don't have that much memory. >> I just realized that I was confused by the indices... >> I had it the way round, before... another question: >> >> M = X.H * X >> >> with X being a de-trended process does give the (unscaled) covariance >> matrix? 
> > for linear least squares the unscaled covariance matrix is the inverse > of dot(X.T, X) unscaled covariance of the parameter estimate, to be explicit Josef > > for nonlinear least squares X is replaced by the Jacobian. > > Josef > >> >> cheers, Johannes >>> 13.02.2012 09:59, Johannes Eckstein kirjoitti: >>> [clip] >>>> how can I do a matrix multiplication of a matrix X with shape: (240001, 4) >>>> >>>> M = X * X.H >>>> >>>> when I do this I get the following: >>>> ? ? ?return N.dot(self, asmatrix(other)) >>>> ValueError: array is too big. >>>> >>>> What is the best way to avoid this error? >>> The result would be a 240001 x 240001 matrix that consumes 430 GB of >>> memory. Do you really have that much available? >>> >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user From kevin.gullikson at gmail.com Sat Feb 11 19:39:20 2012 From: kevin.gullikson at gmail.com (Kevin Gullikson) Date: Sat, 11 Feb 2012 18:39:20 -0600 Subject: [SciPy-User] [SciPy-user] Covariance matrix In-Reply-To: <33301423.post@talk.nabble.com> References: <33301423.post@talk.nabble.com> Message-ID: Use full_output=True when you call leastq, and you will get a matrix (among other things). If you multiply that matrix by the standard deviation of the residuals, it will be the covariance matrix. Kevin Gullikson On Fri, Feb 10, 2012 at 10:11 AM, suzana8447 wrote: > > Hello every body, > I am using least square fit to fit some function to a given data. The fit > is > perfect with leastsq. Now, I need to calculate the covariance matrix > whereby > the diagonal terms represent the variances for the parameters. > > I need to know, if possible, how to extract the covariance matrix from > leastsq. If there is no way to extract it, Are there any good methods that > can be used to calculate the covariance matrix with high precision? > > Thanks in advance. > -- > View this message in context: > http://old.nabble.com/Covariance-matrix-tp33301423p33301423.html > Sent from the Scipy-User mailing list archive at Nabble.com. > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From elfnor at gmail.com Sun Feb 12 18:11:35 2012 From: elfnor at gmail.com (Elfnor) Date: Sun, 12 Feb 2012 15:11:35 -0800 (PST) Subject: [SciPy-User] [SciPy-user] scipy.odr - Correct form of covx, covy matrix Message-ID: <33312093.post@talk.nabble.com> I'm having trouble using covariance matrices with odrpack RealData I expected the following adaptation of the test in test_odr.py to work but get the error File "C:\Python27\lib\site-packages\scipy\odr\odrpack.py", line 1080, in run self.output = Output(apply(odr, args, kwds)) ValueError: could not convert we to a suitable array _________________________ import numpy as np from scipy.odr.odrpack import * p_x = np.array([0.,.9,1.8,2.6,3.3,4.4,5.2,6.1,6.5,7.4]) p_y = np.array([5.9,5.4,4.4,4.6,3.5,3.7,2.8,2.8,2.4,1.5]) p_sx = np.array([.03,.03,.04,.035,.07,.11,.13,.22,.74,1.]) p_sy = np.array([1.,.74,.5,.35,.22,.22,.12,.12,.1,.04]) def pearson_fcn(B, x): return B[0] + B[1]*x covp_x = np.diag(p_sx**2) covp_y = np.diag(p_sy**2) p_dat = RealData(p_x, p_y, covx=covp_x, covy=covp_y) p_mod = Model(pearson_fcn, meta=dict(name='Uni-linear Fit')) p_odr = ODR(p_dat, p_mod, beta0=[1.,1.]) out = p_odr.run() out.pprint() _________________________________ Does anyone know the correct matrix form for this? thanks Eleanor -- View this message in context: http://old.nabble.com/scipy.odr---Correct-form-of-covx%2C-covy-matrix-tp33312093p33312093.html Sent from the Scipy-User mailing list archive at Nabble.com. From josef.pktd at gmail.com Mon Feb 13 10:56:06 2012 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Mon, 13 Feb 2012 10:56:06 -0500 Subject: [SciPy-User] [SciPy-user] Covariance matrix In-Reply-To: References: <33301423.post@talk.nabble.com> Message-ID: On Sat, Feb 11, 2012 at 7:39 PM, Kevin Gullikson wrote: > Use full_output=True when you call leastq, and you will get a matrix (among > other things). If you multiply that matrix by the standard deviation of the > residuals, it will be the covariance matrix. As Charles pointed out, multiply by the error variance not the standard deviation. Docstring is wrong in this. Josef > > Kevin Gullikson > > > > > On Fri, Feb 10, 2012 at 10:11 AM, suzana8447 wrote: >> >> >> Hello every body, >> I am using least square fit to fit some function to a given data. The fit >> is >> perfect with leastsq. Now, I need to calculate the covariance matrix >> whereby >> the diagonal terms represent the variances for the parameters. >> >> I need to know, if possible, how to extract the covariance matrix from >> leastsq. If there is no way to extract it, Are there any good methods that >> can be used to calculate the covariance matrix with high precision? >> >> Thanks in advance. >> -- >> View this message in context: >> http://old.nabble.com/Covariance-matrix-tp33301423p33301423.html >> Sent from the Scipy-User mailing list archive at Nabble.com. 
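To make that scaling concrete, a minimal sketch (the model, the data and the variable names below are invented for illustration and are not taken from any of the posts; it assumes the fit converged and that cov_x is not None):

import numpy as np
from scipy.optimize import leastsq

def residuals(p, x, y):
    # toy model: y = a * exp(-b * x)
    a, b = p
    return y - a * np.exp(-b * x)

x = np.linspace(0, 4, 50)
y = 2.5 * np.exp(-1.3 * x) + 0.05 * np.random.randn(x.size)

popt, cov_x, infodict, mesg, ier = leastsq(residuals, [1.0, 1.0],
                                           args=(x, y), full_output=True)

# cov_x is the unscaled covariance based on the Jacobian alone; scale it by
# the estimated error variance s^2 = SSR / (n - p), not by the standard deviation
resid = residuals(popt, x, y)
s_sq = np.dot(resid, resid) / (len(x) - len(popt))
pcov = cov_x * s_sq                 # covariance of the parameter estimates
perr = np.sqrt(np.diag(pcov))       # standard errors of a and b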
>> >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user > > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From robert.kern at gmail.com Mon Feb 13 10:59:35 2012 From: robert.kern at gmail.com (Robert Kern) Date: Mon, 13 Feb 2012 15:59:35 +0000 Subject: [SciPy-User] [SciPy-user] scipy.odr - Correct form of covx, covy matrix In-Reply-To: <33312093.post@talk.nabble.com> References: <33312093.post@talk.nabble.com> Message-ID: On Sun, Feb 12, 2012 at 23:11, Elfnor wrote: > > I'm having trouble using covariance matrices with odrpack RealData > > I expected the following adaptation of the test in test_odr.py to work but > get the error > > ?File "C:\Python27\lib\site-packages\scipy\odr\odrpack.py", line 1080, in > run > ? ?self.output = Output(apply(odr, args, kwds)) > ValueError: could not convert we to a suitable array > _________________________ > import numpy as np > from ?scipy.odr.odrpack import * > > p_x = np.array([0.,.9,1.8,2.6,3.3,4.4,5.2,6.1,6.5,7.4]) > p_y = np.array([5.9,5.4,4.4,4.6,3.5,3.7,2.8,2.8,2.4,1.5]) > p_sx = np.array([.03,.03,.04,.035,.07,.11,.13,.22,.74,1.]) > p_sy = np.array([1.,.74,.5,.35,.22,.22,.12,.12,.1,.04]) > > def pearson_fcn(B, x): > ? ? ? ?return B[0] + B[1]*x > > covp_x = np.diag(p_sx**2) > covp_y = np.diag(p_sy**2) > > p_dat = RealData(p_x, p_y, covx=covp_x, covy=covp_y) covx and covy aren't covariance matrices across observations, i.e. not the 10x10 matrices that you have here. Rather, what you need to provide is a length-10 array of covariance matrices, one for each observation. For the case where each of your observations are 1-dimensional, you don't construct covariance matrices at all, just provide the standard deviations: p_dat = RealData(p_x, p_y, sx=p_sx, sy=p_sy) https://github.com/scipy/scipy/blob/master/scipy/odr/odrpack.py#L27 -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." ? -- Umberto Eco From josef.pktd at gmail.com Mon Feb 13 11:04:23 2012 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Mon, 13 Feb 2012 11:04:23 -0500 Subject: [SciPy-User] [SciPy-user] Covariance matrix In-Reply-To: References: <33301423.post@talk.nabble.com> Message-ID: On Mon, Feb 13, 2012 at 10:56 AM, wrote: > On Sat, Feb 11, 2012 at 7:39 PM, Kevin Gullikson > wrote: >> Use full_output=True when you call leastq, and you will get a matrix (among >> other things). If you multiply that matrix by the standard deviation of the >> residuals, it will be the covariance matrix. > > As Charles pointed out, multiply by the error variance not the > standard deviation. Docstring is wrong in this. I finally fixed this http://docs.scipy.org/scipy/docs/scipy.optimize.minpack.leastsq/diff/6184/7387/ Josef > > Josef > >> >> Kevin Gullikson >> >> >> >> >> On Fri, Feb 10, 2012 at 10:11 AM, suzana8447 wrote: >>> >>> >>> Hello every body, >>> I am using least square fit to fit some function to a given data. The fit >>> is >>> perfect with leastsq. Now, I need to calculate the covariance matrix >>> whereby >>> the diagonal terms represent the variances for the parameters. >>> >>> I need to know, if possible, how to extract the covariance matrix from >>> leastsq. 
If there is no way to extract it, Are there any good methods that >>> can be used to calculate the covariance matrix with high precision? >>> >>> Thanks in advance. >>> -- >>> View this message in context: >>> http://old.nabble.com/Covariance-matrix-tp33301423p33301423.html >>> Sent from the Scipy-User mailing list archive at Nabble.com. >>> >>> _______________________________________________ >>> SciPy-User mailing list >>> SciPy-User at scipy.org >>> http://mail.scipy.org/mailman/listinfo/scipy-user >> >> >> >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user >> From charlesr.harris at gmail.com Mon Feb 13 11:18:48 2012 From: charlesr.harris at gmail.com (Charles R Harris) Date: Mon, 13 Feb 2012 09:18:48 -0700 Subject: [SciPy-User] [SciPy-user] Covariance matrix In-Reply-To: References: <33301423.post@talk.nabble.com> Message-ID: On Mon, Feb 13, 2012 at 8:56 AM, wrote: > On Sat, Feb 11, 2012 at 7:39 PM, Kevin Gullikson > wrote: > > Use full_output=True when you call leastq, and you will get a matrix > (among > > other things). If you multiply that matrix by the standard deviation of > the > > residuals, it will be the covariance matrix. > > As Charles pointed out, multiply by the error variance not the > standard deviation. Docstring is wrong in this. > > Even more precisely, multiply by ||err||^2/(n - dof), since it is possible that the error has an offset unless the model can perfectly fit a constant. If this actually makes a difference, the model is inadequate, but the variance estimate might be useful if you are using something like the Akaike information criterion to choose the number of parameters. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From surfcast23 at gmail.com Mon Feb 13 23:29:01 2012 From: surfcast23 at gmail.com (Khary Richardson) Date: Mon, 13 Feb 2012 23:29:01 -0500 Subject: [SciPy-User] ImportError: DLL load failed: The specified module could not be found Message-ID: Hi, I downloaded and ran the numpy-1.6.1-win32-superpack-python3.2.exe, and I installed scipy-0.10.0.win32-py3.2 from the scipy sourceforge site. When I try to run a code that uses scipy.special I get the following error Traceback (most recent call last): File "C:\Documents and Settings\Khary\My Documents\PHYSICS\ Physics\Bessel.py", line 5, in from scipy.special import jv, jvp File "C:\Python32\lib\site-packages\scipy\special\__init__.py", line 525, in from ._cephes import * ImportError: DLL load failed: The specified module could not be found. Any help would be great. -- StriperCoast SurfCasters Club -------------- next part -------------- An HTML attachment was scrubbed... URL: From fperez.net at gmail.com Mon Feb 13 16:55:45 2012 From: fperez.net at gmail.com (Fernando Perez) Date: Mon, 13 Feb 2012 13:55:45 -0800 Subject: [SciPy-User] Discussion with Guido van Rossum and (hopefully) core python-dev on scientific Python and Python3 Message-ID: Hi folks, [ I'm broadcasting this widely for maximum reach, but I'd appreciate it if replies can be kept to the *numpy* list, which is sort of the 'base' list for scientific/numerical work. It will make it much easier to organize a coherent set of notes later on. Apology if you're subscribed to all and get it 10 times.
] As part of the PyData workshop (http://pydataworkshop.eventbrite.com) to be held March 2 and 3 at the Mountain View Google offices, we have scheduled a session for an open discussion with Guido van Rossum and hopefully as many core python-dev members who can make it. We wanted to seize the combined opportunity of the PyData workshop bringing a number of 'scipy people' to Google with the timeline for Python 3.3, the first release after the Python language moratorium, being within sight: http://www.python.org/dev/peps/pep-0398. While a number of scientific Python packages are already available for Python 3 (either in released form or in their master git branches), it's fair to say that there hasn't been a major transition of the scientific community to Python3. Since there is no more development being done on the Python2 series, eventually we will all want to find ways to make this transition, and we think that this is an excellent time to engage the core python development team and consider ideas that would make Python3 generally a more appealing language for scientific work. Guido has made it clear that he doesn't speak for the day-to-day development of Python anymore, so we all should be aware that any ideas that come out of this panel will still need to be discussed with python-dev itself via standard mechanisms before anything is implemented. Nonetheless, the opportunity for a solid face-to-face dialog for brainstorming was too good to pass up. The purpose of this email is then to solicit, from all of our community, ideas for this discussion. In a week or so we'll need to summarize the main points brought up here and make a more concrete agenda out of it; I will also post a summary of the meeting afterwards here. Anything is a valid topic, some points just to get the conversation started: - Extra operators/PEP 225. Here's a summary from the last time we went over this, years ago at Scipy 2008: http://mail.scipy.org/pipermail/numpy-discussion/2008-October/038234.html, and the current status of the document we wrote about it is here: file:///home/fperez/www/site/_build/html/py4science/numpy-pep225/numpy-pep225.html. - Improved syntax/support for rationals or decimal literals? While Python now has both decimals (http://docs.python.org/library/decimal.html) and rationals (http://docs.python.org/library/fractions.html), they're quite clunky to use because they require full constructor calls. Guido has mentioned in previous discussions toying with ideas about support for different kinds of numeric literals... - Using the numpy docstring standard python-wide, and thus having python improve the pathetic state of the stdlib's docstrings? This is an area where our community is light years ahead of the standard library, but we'd all benefit from Python itself improving on this front. I'm toying with the idea of giving a lighting talk at PyConn about this, comparing the great, robust culture and tools of good docstrings across the Scipy ecosystem with the sad, sad state of docstrings in the stdlib. It might spur some movement on that front from the stdlib authors, esp. if the core python-dev team realizes the value and benefit it can bring (at relatively low cost, given how most of the information does exist, it's just in the wrong places). 
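For readers who have not seen it, a schematic and deliberately minimal example of what a numpy-standard docstring looks like (the function itself is made up):

import numpy as np

def clip_lower(arr, threshold):
    """Clip values of an array from below.

    Parameters
    ----------
    arr : ndarray
        Input array.
    threshold : float
        Values smaller than `threshold` are replaced by it.

    Returns
    -------
    clipped : ndarray
        Array of the same shape as `arr`.
    """
    return np.maximum(arr, threshold)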
But more importantly for us, if there was truly a universal standard for high-quality docstrings across Python projects, building good documentation/help machinery would be a lot easier, as we'd know what to expect and search for (such as rendering them nicely in the ipython notebook, providing high-quality cross-project help search, etc). - Literal syntax for arrays? Sage has been floating a discussion about a literal matrix syntax (https://groups.google.com/forum/#!topic/sage-devel/mzwepqZBHnA). For something like this to go into python in any meaningful way there would have to be core multidimensional arrays in the language, but perhaps it's time to think about a piece of the numpy array itself into Python? This is one of the more 'out there' ideas, but after all, that's the point of a discussion like this, especially considering we'll have both Travis and Guido in one room. - Other syntactic sugar? Sage has "a..b" <=> range(a, b+1), which I actually think is both nice and useful... There's also the question of allowing "a:b:c" notation outside of [], which has come up a few times in conversation over the last few years. Others? - The packaging quagmire? This continues to be a problem, though python3 does have new improvements to distutils. I'm not really up to speed on the situation, to be frank. If we want to bring this up, someone will have to provide a solid reference or volunteer to do it in person. - etc... I'm putting the above just to *start* the discussion, but the real point is for the rest of the community to contribute ideas, so don't be shy. Final note: while I am here commiting to organizing and presenting this at the discussion with Guido (as well as contacting python-dev), I would greatly appreciate help with the task of summarizing this prior to the meeting as I'm pretty badly swamped in the run-in to pydata/pycon. So if anyone is willing to help draft the summary as the date draws closer (we can put it up on a github wiki, gist, whatever), I will be very grateful. I'm sure it will be better than what I'll otherwise do the last night at 2am :) Cheers, f ps - to the obvious question about webcasting the discussion live for remote participation: yes, we looked into it already; no, unfortunately it appears it won't be possible. We'll try to at least have the audio recorded (and possibly video) for posting later on. pps- if you are close to Mountain View and are interested in attending this panel in person, drop me a line at fernando.perez at berkeley.edu. We have a few spots available *for this discussion only* on top of the pydata regular attendance (which is long closed, I'm afraid). But we'll need to provide Google with a list of those attendees in advance. Please indicate if you are a core python committer in your email, as we'll give priority for this overflow pool to core python developers (but will otherwise accommodate as many people as Google lets us). From elfnor at gmail.com Mon Feb 13 21:38:03 2012 From: elfnor at gmail.com (Elfnor) Date: Mon, 13 Feb 2012 18:38:03 -0800 (PST) Subject: [SciPy-User] [SciPy-user] scipy.odr - Correct form of covx, covy matrix In-Reply-To: References: <33312093.post@talk.nabble.com> Message-ID: <33319485.post@talk.nabble.com> Robert Kern-2 wrote: > > > covx and covy aren't covariance matrices across observations, i.e. not > the 10x10 matrices that you have here. Rather, what you need to > provide is a length-10 array of covariance matrices, one for each > observation. 
For the case where each of your observations are > 1-dimensional, you don't construct covariance matrices at all, just > provide the standard deviations: > > p_dat = RealData(p_x, p_y, sx=p_sx, sy=p_sy) > > https://github.com/scipy/scipy/blob/master/scipy/odr/odrpack.py#L27 > > -- > OK. I gave an example with diagonal covariance matrices to make the problem easy to explain. covy in ODR I see would be 10 2 by 2 matrices for the pearson's example, each giving the covaraiance between the parameters at that observation. I am trying to solve a generalised least squares problem of the form A*x = b where A is the problem design matrix and b is a vector of correlated observations with covariance V. V is not diagonal. That is find x that minimizes (b - A*x)'*inv(V)*(b - A*x) similar to the matlab function lscov. I'm now looking at the GLS fit from scikits.statsmodels. Thanks Eleanor -- View this message in context: http://old.nabble.com/scipy.odr---Correct-form-of-covx%2C-covy-matrix-tp33312093p33319485.html Sent from the Scipy-User mailing list archive at Nabble.com. From robert.kern at gmail.com Tue Feb 14 09:29:17 2012 From: robert.kern at gmail.com (Robert Kern) Date: Tue, 14 Feb 2012 14:29:17 +0000 Subject: [SciPy-User] [SciPy-user] scipy.odr - Correct form of covx, covy matrix In-Reply-To: <33319485.post@talk.nabble.com> References: <33312093.post@talk.nabble.com> <33319485.post@talk.nabble.com> Message-ID: On Tue, Feb 14, 2012 at 02:38, Elfnor wrote: > I am trying to solve a generalised least squares problem of the form A*x = b > where A is the problem design matrix and b is a vector of correlated > observations with covariance V. V is not diagonal. > > That is find x that minimizes (b - A*x)'*inv(V)*(b - A*x) similar to the > matlab function lscov. I think you can distribute the inv(V) by doing a Cholesky factorization, then doing a Cholesky-solve on both A and b, then doing a linear solve on the transformed A and b. from scipy import linalg cho = linalg.cho_factor(V) Ac = linalg.cho_solve(cho, A) bc = linalg.cho_solve(cho, b) x = linalg.solve(Ac, bc) -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." ? -- Umberto Eco From josef.pktd at gmail.com Tue Feb 14 09:55:47 2012 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Tue, 14 Feb 2012 09:55:47 -0500 Subject: [SciPy-User] [SciPy-user] scipy.odr - Correct form of covx, covy matrix In-Reply-To: References: <33312093.post@talk.nabble.com> <33319485.post@talk.nabble.com> Message-ID: On Tue, Feb 14, 2012 at 9:29 AM, Robert Kern wrote: > On Tue, Feb 14, 2012 at 02:38, Elfnor wrote: > >> I am trying to solve a generalised least squares problem of the form A*x = b >> where A is the problem design matrix and b is a vector of correlated >> observations with covariance V. V is not diagonal. to be precise V is the correlation of the error b-A*x, I assume that's what you mean. >> >> That is find x that minimizes (b - A*x)'*inv(V)*(b - A*x) similar to the >> matlab function lscov. > > I think you can distribute the inv(V) by doing a Cholesky > factorization, then doing a Cholesky-solve on both A and b, then doing > a linear solve on the transformed A and b. > > ?from scipy import linalg > > ?cho = linalg.cho_factor(V) > ?Ac = linalg.cho_solve(cho, A) > ?bc = linalg.cho_solve(cho, b) > ?x = linalg.solve(Ac, bc) That's pretty much what statsmodels GLS does. 
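Spelled out, that whitening approach looks roughly like the following sketch (the function name, array shapes and variable names are assumptions for illustration, not code taken from statsmodels or from the posts above):

import numpy as np
from scipy import linalg

def gls(A, b, V):
    # A: (n, p) design matrix, b: length-n observations,
    # V: (n, n) symmetric positive definite covariance of the errors.
    # With V = C C.T, (b - A x)' inv(V) (b - A x) = || inv(C) (b - A x) ||^2,
    # so whitening by inv(C) turns GLS into an ordinary least-squares problem.
    C = linalg.cholesky(V, lower=True)
    wA = linalg.solve_triangular(C, A, lower=True)   # whitened design
    wb = linalg.solve_triangular(C, b, lower=True)   # whitened observations
    x, res, rank, sv = linalg.lstsq(wA, wb)
    cov_x = linalg.inv(np.dot(wA.T, wA))   # unscaled covariance, inv(A' inv(V) A)
    return x, cov_x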
The transformed variables have a w prefix, wendog, wexog, wresid I don't think we have many examples with the full sized (nobs, nobs) V matrix. Josef > > -- > Robert Kern > > "I have come to believe that the whole world is an enigma, a harmless > enigma that is made terrible by our own mad attempt to interpret it as > though it had an underlying truth." > ? -- Umberto Eco > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user From sturla at molden.no Tue Feb 14 10:20:00 2012 From: sturla at molden.no (Sturla Molden) Date: Tue, 14 Feb 2012 16:20:00 +0100 Subject: [SciPy-User] [SciPy-user] scipy.odr - Correct form of covx, covy matrix In-Reply-To: References: <33312093.post@talk.nabble.com> <33319485.post@talk.nabble.com> Message-ID: <4F3A7BA0.9010900@molden.no> On 14.02.2012 15:55, josef.pktd at gmail.com wrote: >> from scipy import linalg >> >> cho = linalg.cho_factor(V) >> Ac = linalg.cho_solve(cho, A) >> bc = linalg.cho_solve(cho, b) >> x = linalg.solve(Ac, bc) > > That's pretty much what statsmodels GLS does. The transformed > variables have a w prefix, wendog, wexog, wresid > > I don't think we have many examples with the full sized (nobs, nobs) V matrix. Why not call lapack DGGGLM? It does the same, except with QR. It could e.g. be exposed as sp.linalg.glstsq. Sturla From LEisenman at wustl.edu Tue Feb 14 10:15:11 2012 From: LEisenman at wustl.edu (Larry Eisenman) Date: Tue, 14 Feb 2012 15:15:11 +0000 (UTC) Subject: [SciPy-User] [ANN] release of Neo 0.2.0 References: <4F38D337.7000703@olfac.univ-lyon1.fr> Message-ID: Samuel Garcia olfac.univ-lyon1.fr> writes: > > > Dear scipy list, > We are proud to announce the 0.2.0 release of?Neo, a Python library for working with > electrophysiology data, whether from biological experiments or from > simulations. .... How does this compare/relate to the Nitime module of the Nipy project (http://nipy.sourceforge.net/nitime/)? Larry From josef.pktd at gmail.com Tue Feb 14 11:12:40 2012 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Tue, 14 Feb 2012 11:12:40 -0500 Subject: [SciPy-User] [SciPy-user] scipy.odr - Correct form of covx, covy matrix In-Reply-To: <4F3A7BA0.9010900@molden.no> References: <33312093.post@talk.nabble.com> <33319485.post@talk.nabble.com> <4F3A7BA0.9010900@molden.no> Message-ID: On Tue, Feb 14, 2012 at 10:20 AM, Sturla Molden wrote: > On 14.02.2012 15:55, josef.pktd at gmail.com wrote: > >>> ? from scipy import linalg >>> >>> ? cho = linalg.cho_factor(V) >>> ? Ac = linalg.cho_solve(cho, A) >>> ? bc = linalg.cho_solve(cho, b) >>> ? x = linalg.solve(Ac, bc) >> >> That's pretty much what statsmodels GLS does. The transformed >> variables have a w prefix, wendog, wexog, wresid >> >> I don't think we have many examples with the full sized (nobs, nobs) V matrix. > > Why not call lapack DGGGLM? It does the same, except with QR. I didn't know about it and it's not yet available > > It could e.g. be exposed as sp.linalg.glstsq. volunteers? One problem we have in statsmodels is that we still need additional results not just the parameter estimates, for example inv(x'x) (or pinv) or inv(X' inv(V) X) (IIRC) for the covariance matrix and the determinant for the log likelihood. I tried to work on the linear algebra a few times to avoid some repeated and some numerically doubtful calculations, but it's too much work (when I'd rather work on something else). 
Now we are just using pinv or QR for the least squares solution and cholesky for the transformation. Josef > > Sturla > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user From sturla at molden.no Tue Feb 14 12:51:01 2012 From: sturla at molden.no (Sturla Molden) Date: Tue, 14 Feb 2012 18:51:01 +0100 Subject: [SciPy-User] [SciPy-user] scipy.odr - Correct form of covx, covy matrix In-Reply-To: References: <33312093.post@talk.nabble.com> <33319485.post@talk.nabble.com> <4F3A7BA0.9010900@molden.no> Message-ID: <4F3A9F05.4080905@molden.no> On 14.02.2012 17:12, josef.pktd at gmail.com wrote: > One problem we have in statsmodels is that we still need additional > results not just the parameter estimates, for example inv(x'x) (or > pinv) or inv(X' inv(V) X) (IIRC) for the covariance matrix and the > determinant for the log likelihood. After fitting with dggglm the y output variable contain the decorrelated residuals, so that should give you the log likelihood. But I am not sure about the covariance matrix. Sturla From sturla at molden.no Tue Feb 14 14:40:26 2012 From: sturla at molden.no (Sturla Molden) Date: Tue, 14 Feb 2012 20:40:26 +0100 Subject: [SciPy-User] [OT] Bayesian vs. frequentist Message-ID: <4F3AB8AA.2040402@molden.no> After having worked with applied statistics for ~15 years, I have reached this conclusion... ;-) Sturla's 20 propositions on Bayesian vs. classical statistics: ============================================================== 1. For simple data, a figure is sufficient, nobody really cares. 2. For dummy problems with known facit, Bayesian methods tend to be the more accurate. 3. Bayesian methods include prior knowlege. A horse of 400 g is a priori less likely than a horse of 400 kg. Frequentists say this is too subjective. 4. Bayesian methods are easier to interpret. Few understand a frequenctist confidence interval, albeit everybody they think they do. 5. Hypothesis testing: Bayesians answer the question we ask. Freuentists don't. 6. Economists investing their own money are bayesians. 7. Economists investing your money are frequentists. 8. For basic medical research, nobody cares. 9. Drug trials: For getting an FDA application approved, frequentists often yield a more 'significant result'. 10. Drug trials: For in-hose liability estimates, Bayesian methods are the safer. 11. Frequentists can always get more significant results by "sampling more data". 12. Frequentists don't care about stopping rules, even though they should. 13. Bayesians don't care about stopping rules bacause they don't have to. 14. "Significant" does not mean "important". Any tiny difference can be made statistically significant. 15. For interpreting clinical lab tests, Bayesian methods prevail, e.g. predictive values. 16. Engineers who know their mathematics use Bayesian methods. 17. Social scientists who don't know their mathematics are frequentists. 18. SPSS, Excel, Minitab, and SAS make it easy to be an ignorant frequentist. 19. No tool make it easy to be an ignorant bayesian. 20. Competent analysts use R, Fortran, Matlab or Python. 
From scipy at samueljohn.de Tue Feb 14 14:49:40 2012 From: scipy at samueljohn.de (Samuel John) Date: Tue, 14 Feb 2012 20:49:40 +0100 Subject: [SciPy-User] ANN: SciPy 0.10.1 release candidate 1 In-Reply-To: References: Message-ID: <20494732-5882-4B0B-8A99-280D5527A74B@samueljohn.de> On 13.02.2012, at 07:02, Ralf Gommers wrote: > OS X Lion build is still very painful, if someone wants to have a go at making numpy/scipy work with llvm-gcc that would be very helpful. I just tested clang with great success! With llvm-gcc I got segfaults in scipy. But not with clang. Both, scipy 0.10.0 and numpy 1.6.1 compile with on OS X 10.7.3 with latest XCode: ```` export CC=clang export CXX=clang++ export FFLAGS='-ff2c' python setup.py build --fcompiler=gfortran python setup.py install ```` Just the known arpack related scipy failuers and umath_complex related failures numpy failures. The only numpy failure which I am not sure about is: FAIL: Test basic arithmetic function errors AssertionError: Type did not raise fpe error ''. Does someone want to comment on clang or test it on Mac OS X (Lion) ? cheers, Samuel From josef.pktd at gmail.com Tue Feb 14 15:24:57 2012 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Tue, 14 Feb 2012 15:24:57 -0500 Subject: [SciPy-User] [OT] Bayesian vs. frequentist In-Reply-To: <4F3AB8AA.2040402@molden.no> References: <4F3AB8AA.2040402@molden.no> Message-ID: On Tue, Feb 14, 2012 at 2:40 PM, Sturla Molden wrote: > > After having worked with applied statistics for ~15 years, I have > reached this conclusion... ;-) Do you expect an argument? sounds a bit like http://andrewgelman.com/ Even 20 years ago when I was a Bayesian, I didn't understand what most of the excitement in the religious differences between Bayesian and Frequentist fundamentalists was all about. ;) Josef If you have a hammer, everything looks like a nail. If you have a screw driver, everything looks like a screw. (What's difference in differences, and what's generalized method of moments. Two highlights from Gelman.) > > > Sturla's 20 propositions on Bayesian vs. classical statistics: > ============================================================== > > 1. For simple data, a figure is sufficient, nobody really cares. > > 2. For dummy problems with known facit, Bayesian methods tend to be the > more accurate. > > 3. Bayesian methods include prior knowlege. A horse of 400 g is a priori > less likely than a horse of 400 kg. Frequentists say this is too subjective. > > 4. Bayesian methods are easier to interpret. Few understand a > frequenctist confidence interval, albeit everybody they think they do. > > 5. Hypothesis testing: Bayesians answer the question we ask. Freuentists > don't. > > 6. Economists investing their own money are bayesians. > > 7. Economists investing your money are frequentists. > > 8. For basic medical research, nobody cares. > > 9. Drug trials: For getting an FDA application approved, frequentists > often yield a more 'significant result'. > > 10. Drug trials: For in-hose liability estimates, Bayesian methods are > the safer. > > 11. Frequentists can always get more significant results by "sampling > more data". > > 12. Frequentists don't care about stopping rules, even though they should. > > 13. Bayesians don't care about stopping rules bacause they don't have to. > > 14. "Significant" does not mean "important". Any tiny difference can be > made statistically significant. > > 15. For interpreting clinical lab tests, Bayesian methods prevail, e.g. > predictive values. > > 16. 
Engineers who know their mathematics use Bayesian methods. > > 17. Social scientists who don't know their mathematics are frequentists. > > 18. SPSS, Excel, Minitab, and SAS make it easy to be an ignorant > frequentist. > > 19. No tool make it easy to be an ignorant bayesian. > > 20. Competent analysts use R, Fortran, Matlab or Python. > > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user From pav at iki.fi Tue Feb 14 15:21:55 2012 From: pav at iki.fi (Pauli Virtanen) Date: Tue, 14 Feb 2012 21:21:55 +0100 Subject: [SciPy-User] ANN: SciPy 0.10.1 release candidate 1 In-Reply-To: <20494732-5882-4B0B-8A99-280D5527A74B@samueljohn.de> References: <20494732-5882-4B0B-8A99-280D5527A74B@samueljohn.de> Message-ID: 14.02.2012 20:49, Samuel John kirjoitti: [clip] > Just the known arpack related scipy failuers and umath_complex related failures numpy failures. The bad news is that we though the ARPACK issues were fixed :/ Could you paste the relevant pieces from the test report? The error is probably with test tolerances when doing iterative single-precision inverses inside iterative eigenvalue calculation, but I'd like to double check. Thanks, -- Pauli Virtanen From sturla at molden.no Tue Feb 14 15:39:48 2012 From: sturla at molden.no (Sturla Molden) Date: Tue, 14 Feb 2012 21:39:48 +0100 Subject: [SciPy-User] [OT] Bayesian vs. frequentist In-Reply-To: References: <4F3AB8AA.2040402@molden.no> Message-ID: <4F3AC694.8080205@molden.no> On 14.02.2012 21:24, josef.pktd at gmail.com wrote: > Do you expect an argument? sounds a bit like http://andrewgelman.com/ No I don't. I am just tired of explaining that "significant p-value" does not imply "very important effect". Sorry for spamming the list with my rant. Sturla From scipy at samueljohn.de Tue Feb 14 15:45:00 2012 From: scipy at samueljohn.de (Samuel John) Date: Tue, 14 Feb 2012 21:45:00 +0100 Subject: [SciPy-User] ANN: SciPy 0.10.1 release candidate 1 In-Reply-To: References: <20494732-5882-4B0B-8A99-280D5527A74B@samueljohn.de> Message-ID: <6C9EBB86-59C3-450B-B1E7-1388E18A3099@samueljohn.de> On 14.02.2012, at 21:21, Pauli Virtanen wrote: > 14.02.2012 20:49, Samuel John kirjoitti: > [clip] >> Just the known arpack related scipy failuers and umath_complex related failures numpy failures. > > The bad news is that we though the ARPACK issues were fixed :/ > > Could you paste the relevant pieces from the test report? The error is > probably with test tolerances when doing iterative single-precision > inverses inside iterative eigenvalue calculation, but I'd like to double > check. Sorry, my fault. I used http://sourceforge.net/projects/scipy/files/scipy/0.10.0/scipy-0.10.0.tar.gz and not later, so the arpack fixes were probably not included. I'll just repeat the build process with the latest head rev. From paustin at eos.ubc.ca Tue Feb 14 16:22:20 2012 From: paustin at eos.ubc.ca (Phil Austin) Date: Tue, 14 Feb 2012 13:22:20 -0800 Subject: [SciPy-User] [OT] Bayesian vs. frequentist In-Reply-To: <4F3AC694.8080205@molden.no> References: <4F3AB8AA.2040402@molden.no> <4F3AC694.8080205@molden.no> Message-ID: <4F3AD08C.50906@eos.ubc.ca> On 12-02-14 12:39 PM, Sturla Molden wrote: > On 14.02.2012 21:24, josef.pktd at gmail.com wrote: > > > Do you expect an argument? 
sounds a bit like http://andrewgelman.com/ > Coincidentally, this discussion: http://andrewgelman.com/2012/02/adding-an-error-model-to-a-deterministic-model/ started when a civil engineering PhD posted a request for help. My reading of the ensuing discussion of both posts is that there is still a lot of work to do in bridging statistics (bayesian or frequentist) and deterministic modeling of complex systems. -- Phil From scipy at samueljohn.de Tue Feb 14 16:54:00 2012 From: scipy at samueljohn.de (Samuel John) Date: Tue, 14 Feb 2012 22:54:00 +0100 Subject: [SciPy-User] ANN: SciPy 0.10.1 release candidate 1 In-Reply-To: <6C9EBB86-59C3-450B-B1E7-1388E18A3099@samueljohn.de> References: <20494732-5882-4B0B-8A99-280D5527A74B@samueljohn.de> <6C9EBB86-59C3-450B-B1E7-1388E18A3099@samueljohn.de> Message-ID: [sorry posting to scipy-users and scipy-dev. I feel this is not a good idea, but this thread is already spanning both lists. Replay on scipy-dev pls] The good news is that using clang and clang++ and gfortran from http://r.research.att.com/tools/ (4.2.4-5666.3) numpy and scipy build and test() fine! Yeeha \o/ Anyone with deeper understanding of the scipy internals want to comment on clang usage? Has anyone experienced problems with scipy and clang? However, building numpy 1.6.1 and scipy 0.10.1rc1 (or 0.10.0 or head) on OS X 10.7.3 with Xcode 4.2.1 (build 4D502) with llvm-gcc (which is the default, since non-llvm gcc is not available any more with XCode 4.2) leads to a segfault or to a malloc trap: > Running unit tests for scipy > NumPy version 1.6.1 > NumPy is installed in /usr/local/Cellar/python/2.7.2/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/numpy > SciPy version 0.10.1rc1 > SciPy is installed in /usr/local/Cellar/python/2.7.2/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/scipy > Python version 2.7.2 (default, Feb 14 2012, 22:09:10) [GCC 4.2.1 (Based on Apple Inc. build 5658) (LLVM build 2336.1.00)] > nose version 1.1.2 > ...................................................................................................................................................................................F.FFFPython(13536,0x7fff7c026960) malloc: *** error for object 0x10740b368: incorrect checksum for freed object - object was probably modified after being freed. > *** set a breakpoint in malloc_error_break to debug > Abort trap: 6 As pip install will also use the default llvm-gcc this might be a severe issue! This is already the case right now but does not often show up in daily usage. Perhaps it's possible to set the shell vars CC and CXX during build? Concerning arpack: I'm afraid that the arpack issues prevail in 0.10.1rc1. I beg you to fix these (I am not able to). scipy.test() log at https://gist.github.com/1830780 Samuel From travis at continuum.io Tue Feb 14 16:56:22 2012 From: travis at continuum.io (Travis Oliphant) Date: Tue, 14 Feb 2012 15:56:22 -0600 Subject: [SciPy-User] [OT] Bayesian vs. frequentist In-Reply-To: <4F3AB8AA.2040402@molden.no> References: <4F3AB8AA.2040402@molden.no> Message-ID: can I frame this :-) +10 On Feb 14, 2012, at 1:40 PM, Sturla Molden wrote: > > After having worked with applied statistics for ~15 years, I have > reached this conclusion... ;-) > > > Sturla's 20 propositions on Bayesian vs. classical statistics: > ============================================================== > > 1. For simple data, a figure is sufficient, nobody really cares. > > 2. 
For dummy problems with known facit, Bayesian methods tend to be the > more accurate. > > 3. Bayesian methods include prior knowlege. A horse of 400 g is a priori > less likely than a horse of 400 kg. Frequentists say this is too subjective. > > 4. Bayesian methods are easier to interpret. Few understand a > frequenctist confidence interval, albeit everybody they think they do. > > 5. Hypothesis testing: Bayesians answer the question we ask. Freuentists > don't. > > 6. Economists investing their own money are bayesians. > > 7. Economists investing your money are frequentists. > > 8. For basic medical research, nobody cares. > > 9. Drug trials: For getting an FDA application approved, frequentists > often yield a more 'significant result'. > > 10. Drug trials: For in-hose liability estimates, Bayesian methods are > the safer. > > 11. Frequentists can always get more significant results by "sampling > more data". > > 12. Frequentists don't care about stopping rules, even though they should. > > 13. Bayesians don't care about stopping rules bacause they don't have to. > > 14. "Significant" does not mean "important". Any tiny difference can be > made statistically significant. > > 15. For interpreting clinical lab tests, Bayesian methods prevail, e.g. > predictive values. > > 16. Engineers who know their mathematics use Bayesian methods. > > 17. Social scientists who don't know their mathematics are frequentists. > > 18. SPSS, Excel, Minitab, and SAS make it easy to be an ignorant > frequentist. > > 19. No tool make it easy to be an ignorant bayesian. > > 20. Competent analysts use R, Fortran, Matlab or Python. > > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user From josef.pktd at gmail.com Tue Feb 14 20:25:19 2012 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Tue, 14 Feb 2012 20:25:19 -0500 Subject: [SciPy-User] [OT] Bayesian vs. frequentist In-Reply-To: <4F3AD08C.50906@eos.ubc.ca> References: <4F3AB8AA.2040402@molden.no> <4F3AC694.8080205@molden.no> <4F3AD08C.50906@eos.ubc.ca> Message-ID: On Tue, Feb 14, 2012 at 4:22 PM, Phil Austin wrote: > On 12-02-14 12:39 PM, Sturla Molden wrote: >> ?On 14.02.2012 21:24, josef.pktd at gmail.com wrote: >> >> > Do you expect an argument? sounds a bit like http://andrewgelman.com/ >> > > Coincidentally, this discussion: > http://andrewgelman.com/2012/02/adding-an-error-model-to-a-deterministic-model/ > started when a civil engineering PhD posted a request for help. ?My reading > of the ensuing discussion of both posts is that there is still a lot of > work to > do in bridging statistics (bayesian or frequentist) and deterministic > modeling > of complex systems. I don't quite see why there should be anything deterministic (in the sense of correctly described by a mathematical model) about the growth of bacteria and the response of living tissue, (as there is nothing deterministic in the behavior of the macro economy). In economics we just add a noise variable (unexplained environmental or behavioral shocks) everywhere. I thought these were exactly the kind of dynamic problems that Kalman Filter (or it's nonlinear successors) were invented for. 
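As a concrete toy version of that idea, a scalar state-space sketch (all names and numbers are invented; it only illustrates the "deterministic model plus noise" setup, not any model from the linked discussion):

import numpy as np

# deterministic growth x_t = a * x_{t-1}, plus process and observation noise
a, q, r = 1.05, 0.01, 0.25      # growth rate, process variance, observation variance
rng = np.random.RandomState(0)
T = 100

x = np.empty(T)                 # latent states
x[0] = 1.0
for t in range(1, T):
    x[t] = a * x[t - 1] + np.sqrt(q) * rng.randn()
y = x + np.sqrt(r) * rng.randn(T)   # noisy observations

# standard scalar Kalman filter recursions
xf = np.empty(T)                # filtered state estimates
xf[0] = y[0]
P = r                           # initial state variance (rough choice)
for t in range(1, T):
    x_pred = a * xf[t - 1]              # predict
    P_pred = a * P * a + q
    K = P_pred / (P_pred + r)           # Kalman gain (observation matrix H = 1)
    xf[t] = x_pred + K * (y[t] - x_pred)
    P = (1.0 - K) * P_pred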
My main impression of the two articles and discussion is that being a Bayesian is a lot of work if you need to have a fully specified prior and likelihood, instead of just working with some semi-parametric estimation method (like least squares) that still produces results even if you don't have a fully specified likelihood. (It might not be efficient compared to the case when you have full information, but your results are less wrong than if your full specification is wrong.) Josef Mommy! I found a statistically significant penny. I'm rich. :) > > -- Phil > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user From charlesr.harris at gmail.com Tue Feb 14 21:10:39 2012 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 14 Feb 2012 19:10:39 -0700 Subject: [SciPy-User] [OT] Bayesian vs. frequentist In-Reply-To: References: <4F3AB8AA.2040402@molden.no> <4F3AC694.8080205@molden.no> <4F3AD08C.50906@eos.ubc.ca> Message-ID: On Tue, Feb 14, 2012 at 6:25 PM, wrote: > On Tue, Feb 14, 2012 at 4:22 PM, Phil Austin wrote: > > On 12-02-14 12:39 PM, Sturla Molden wrote: > >> On 14.02.2012 21:24, josef.pktd at gmail.com wrote: > >> > >> > Do you expect an argument? sounds a bit like http://andrewgelman.com/ > >> > > > > Coincidentally, this discussion: > > > http://andrewgelman.com/2012/02/adding-an-error-model-to-a-deterministic-model/ > > started when a civil engineering PhD posted a request for help. My > reading > > of the ensuing discussion of both posts is that there is still a lot of > > work to > > do in bridging statistics (bayesian or frequentist) and deterministic > > modeling > > of complex systems. > > I don't quite see why there should be anything deterministic (in the > sense of correctly described by a mathematical model) about the growth > of bacteria and the response of living tissue, (as there is nothing > deterministic in the behavior of the macro economy). In economics we > just add a noise variable (unexplained environmental or behavioral > shocks) everywhere. > > I thought these were exactly the kind of dynamic problems that Kalman > Filter (or it's nonlinear successors) were invented for. > > My main impression of the two articles and discussion is that being a > Bayesian is a lot of work if you need to have a fully specified prior > and likelihood, instead of just working with some semi-parametric > estimation method (like least squares) that still produces results > even if you don't have a fully specified likelihood. (It might not be > efficient compared to the case when you have full information, but > your results are less wrong than if your full specification is wrong.) > > Well, invented priors can be used to bias parametric results for political purposes. Thar's gold in them priors. So there is that ;) I read E. T. Jaynes early papers and his book and enjoyed them, but I think treating physical entropy by Bayesian methods was a bit much. I don't think think the thermodynamic properties of a system depend on the observers knowlege. I would say both methods have their place, just use the right one for the problem at hand. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From travis at continuum.io Tue Feb 14 22:49:30 2012 From: travis at continuum.io (Travis Oliphant) Date: Tue, 14 Feb 2012 21:49:30 -0600 Subject: [SciPy-User] [OT] Bayesian vs. 
frequentist In-Reply-To: References: <4F3AB8AA.2040402@molden.no> <4F3AC694.8080205@molden.no> <4F3AD08C.50906@eos.ubc.ca> Message-ID: <2B129FA1-0A36-46D4-AA2B-0FF76702671A@continuum.io> > > Coincidentally, this discussion: > > http://andrewgelman.com/2012/02/adding-an-error-model-to-a-deterministic-model/ > > started when a civil engineering PhD posted a request for help. My reading > > of the ensuing discussion of both posts is that there is still a lot of > > work to > > do in bridging statistics (bayesian or frequentist) and deterministic > > modeling > > of complex systems. > > I don't quite see why there should be anything deterministic (in the > sense of correctly described by a mathematical model) about the growth > of bacteria and the response of living tissue, (as there is nothing > deterministic in the behavior of the macro economy). In economics we > just add a noise variable (unexplained environmental or behavioral > shocks) everywhere. > > I thought these were exactly the kind of dynamic problems that Kalman > Filter (or it's nonlinear successors) were invented for. > > My main impression of the two articles and discussion is that being a > Bayesian is a lot of work if you need to have a fully specified prior > and likelihood, instead of just working with some semi-parametric > estimation method (like least squares) that still produces results > even if you don't have a fully specified likelihood. (It might not be > efficient compared to the case when you have full information, but > your results are less wrong than if your full specification is wrong.) > > > Well, invented priors can be used to bias parametric results for political purposes. Thar's gold in them priors. So there is that ;) > > I read E. T. Jaynes early papers and his book and enjoyed them, but I think treating physical entropy by Bayesian methods was a bit much. I don't think think the thermodynamic properties of a system depend on the observers knowlege. I would say both methods have their place, just use the right one for the problem at hand. > It sounds like we will have to revisit your views there over drinks sometime. I think the whole point is that there is really no such thing as physical entropy. It's all just a property that you have to assign to a system if you want maximum reproducibility without constraining everything. That's the way I prefer to think about it at this point anyway ;-) -Travis > Chuck > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Tue Feb 14 23:15:04 2012 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 14 Feb 2012 21:15:04 -0700 Subject: [SciPy-User] [OT] Bayesian vs. frequentist In-Reply-To: <2B129FA1-0A36-46D4-AA2B-0FF76702671A@continuum.io> References: <4F3AB8AA.2040402@molden.no> <4F3AC694.8080205@molden.no> <4F3AD08C.50906@eos.ubc.ca> <2B129FA1-0A36-46D4-AA2B-0FF76702671A@continuum.io> Message-ID: On Tue, Feb 14, 2012 at 8:49 PM, Travis Oliphant wrote: > > > Coincidentally, this discussion: >> > >> http://andrewgelman.com/2012/02/adding-an-error-model-to-a-deterministic-model/ >> > started when a civil engineering PhD posted a request for help. 
My >> reading >> > of the ensuing discussion of both posts is that there is still a lot of >> > work to >> > do in bridging statistics (bayesian or frequentist) and deterministic >> > modeling >> > of complex systems. >> >> I don't quite see why there should be anything deterministic (in the >> sense of correctly described by a mathematical model) about the growth >> of bacteria and the response of living tissue, (as there is nothing >> deterministic in the behavior of the macro economy). In economics we >> just add a noise variable (unexplained environmental or behavioral >> shocks) everywhere. >> >> I thought these were exactly the kind of dynamic problems that Kalman >> Filter (or it's nonlinear successors) were invented for. >> >> My main impression of the two articles and discussion is that being a >> Bayesian is a lot of work if you need to have a fully specified prior >> and likelihood, instead of just working with some semi-parametric >> estimation method (like least squares) that still produces results >> even if you don't have a fully specified likelihood. (It might not be >> efficient compared to the case when you have full information, but >> your results are less wrong than if your full specification is wrong.) >> >> > Well, invented priors can be used to bias parametric results for political > purposes. Thar's gold in them priors. So there is that ;) > > I read E. T. Jaynes early papers and his book and enjoyed them, but I > think treating physical entropy by Bayesian methods was a bit much. I don't > think think the thermodynamic properties of a system depend on the > observers knowlege. I would say both methods have their place, just use the > right one for the problem at hand. > > > It sounds like we will have to revisit your views there over drinks > sometime. I think the whole point is that there is really no such thing > as physical entropy. It's all just a property that you have to assign to > a system if you want maximum reproducibility without constraining > everything. That's the way I prefer to think about it at this point > anyway ;-) > > Classically, it's an assumption about the behavior of the dynamical systems. It doesn't even need to be exactly so. But in any case, it is a dynamical problem, not a knowledge problem, and has physical affects that have nothing to do with the observer. Heat flows from hot to cold whatever you care to think. A watched pot on the stove *does* boil. Where things get interesting is if you start manipulating the system, start measuring things or put in a Maxwell's demon. Then there is an interplay. Deeper down, one might start asking questions about the physical representation of the knowledge in the observer. Chuck > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From daniele at grinta.net Wed Feb 15 03:21:10 2012 From: daniele at grinta.net (Daniele Nicolodi) Date: Wed, 15 Feb 2012 09:21:10 +0100 Subject: [SciPy-User] [OT] Bayesian vs. frequentist In-Reply-To: <4F3AB8AA.2040402@molden.no> References: <4F3AB8AA.2040402@molden.no> Message-ID: <4F3B6AF6.70009@grinta.net> Hello, I'll hijack this thread to ask for advice. I'm a physicist and, as you may expect, my education in statistics is mostly in Frequentists methods. However, I always had an interest in Bayesian methods, as those seems to solve in much more natural ways the problems that arise in complex data analysis. I recently started to read "Data Analysis, A Bayesian Tutorial" by D.S. 
Silva (currently reading chapter 4, unfortunately real work is always interfering) and I really like the approach and the straight forward manner in which the theory builds up. However, I feel that the Bayesian approach, is much more difficult to translate to practical methods I can implement, but I may be biased by the long term exposition to the "recipe based" Frequentist approach. Can someone suggest me some resources (documentation or code) where some practical approaches to Bayesian analysis are taught? Thank you. Cheers, -- Daniele From johann.cohentanugi at gmail.com Wed Feb 15 03:33:30 2012 From: johann.cohentanugi at gmail.com (Johann Cohen-Tanugi) Date: Wed, 15 Feb 2012 09:33:30 +0100 Subject: [SciPy-User] [OT] Bayesian vs. frequentist In-Reply-To: <4F3B6AF6.70009@grinta.net> References: <4F3AB8AA.2040402@molden.no> <4F3B6AF6.70009@grinta.net> Message-ID: <4F3B6DDA.7090905@gmail.com> Hi Daniele, what domain of physics? Do you know of the work by Guido D'Agostini? He is a particle physics by trade I believe. If you are more into astrophysics, check Tom Loredo. Johann On 02/15/2012 09:21 AM, Daniele Nicolodi wrote: > Hello, I'll hijack this thread to ask for advice. > > I'm a physicist and, as you may expect, my education in statistics is > mostly in Frequentists methods. However, I always had an interest in > Bayesian methods, as those seems to solve in much more natural ways the > problems that arise in complex data analysis. > > I recently started to read "Data Analysis, A Bayesian Tutorial" by D.S. > Silva (currently reading chapter 4, unfortunately real work is always > interfering) and I really like the approach and the straight forward > manner in which the theory builds up. > > However, I feel that the Bayesian approach, is much more difficult to > translate to practical methods I can implement, but I may be biased by > the long term exposition to the "recipe based" Frequentist approach. > > Can someone suggest me some resources (documentation or code) where some > practical approaches to Bayesian analysis are taught? > > Thank you. Cheers, From dgorman at berkeley.edu Wed Feb 15 03:44:13 2012 From: dgorman at berkeley.edu (Dylan Gorman) Date: Wed, 15 Feb 2012 00:44:13 -0800 Subject: [SciPy-User] [OT] Bayesian vs. frequentist In-Reply-To: <4F3B6AF6.70009@grinta.net> References: <4F3AB8AA.2040402@molden.no> <4F3B6AF6.70009@grinta.net> Message-ID: Well there's always the classic by Jaynes, of Jaynes-Cummings model fame. (At least, famous to me since I do quantum optics.) "Probability Theory: The Logic Of Science." On Feb 15, 2012, at 12:21 AM, Daniele Nicolodi wrote: > Hello, I'll hijack this thread to ask for advice. > > I'm a physicist and, as you may expect, my education in statistics is > mostly in Frequentists methods. However, I always had an interest in > Bayesian methods, as those seems to solve in much more natural ways the > problems that arise in complex data analysis. > > I recently started to read "Data Analysis, A Bayesian Tutorial" by D.S. > Silva (currently reading chapter 4, unfortunately real work is always > interfering) and I really like the approach and the straight forward > manner in which the theory builds up. > > However, I feel that the Bayesian approach, is much more difficult to > translate to practical methods I can implement, but I may be biased by > the long term exposition to the "recipe based" Frequentist approach. 
> > Can someone suggest me some resources (documentation or code) where some > practical approaches to Bayesian analysis are taught? > > Thank you. Cheers, > -- > Daniele > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user From sgarcia at olfac.univ-lyon1.fr Wed Feb 15 03:58:33 2012 From: sgarcia at olfac.univ-lyon1.fr (Samuel Garcia) Date: Wed, 15 Feb 2012 09:58:33 +0100 Subject: [SciPy-User] [ANN] release of Neo 0.2.0 In-Reply-To: References: <4F38D337.7000703@olfac.univ-lyon1.fr> Message-ID: <4F3B73B9.8050408@olfac.univ-lyon1.fr> Nitime is oriented for time series of neuro imaging. Neo offer a complete data model center around electrophysiology (in vivo or simulations) with more objects than nitime (for this fields of course): RecodingChannelGroup +RecodingChannel + Segment + Block +Unit+... Samuel Le 14/02/2012 16:15, Larry Eisenman a ?crit : > Samuel Garcia olfac.univ-lyon1.fr> writes: > >> >> Dear scipy list, >> We are proud to announce the 0.2.0 release of Neo, a Python library for > working with >> electrophysiology data, whether from biological experiments or from >> simulations. > .... > > How does this compare/relate to the Nitime module of the Nipy project > (http://nipy.sourceforge.net/nitime/)? > > Larry > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user -- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Samuel Garcia Lyon Neuroscience CNRS - UMR5292 - INSERM U1028 - Universite Claude Bernard LYON 1 Equipe R et D 50, avenue Tony Garnier 69366 LYON Cedex 07 FRANCE T?l : 04 37 28 74 24 Fax : 04 37 28 76 01 http://olfac.univ-lyon1.fr/unite/equipe-07/ http://neuralensemble.org/trac/OpenElectrophy ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ From emanuele at relativita.com Wed Feb 15 04:00:12 2012 From: emanuele at relativita.com (Emanuele Olivetti) Date: Wed, 15 Feb 2012 10:00:12 +0100 Subject: [SciPy-User] [OT] Bayesian vs. frequentist In-Reply-To: <4F3B6AF6.70009@grinta.net> References: <4F3AB8AA.2040402@molden.no> <4F3B6AF6.70009@grinta.net> Message-ID: <4F3B741C.2030901@relativita.com> On 02/15/2012 09:21 AM, Daniele Nicolodi wrote: > I recently started to read "Data Analysis, A Bayesian Tutorial" by D.S. > Silva (currently reading chapter 4, unfortunately real work is always > interfering) and I really like the approach and the straight forward > manner in which the theory builds up. > Sivia&Skilling's book is very good. A similar and good one is Gregory's "Bayesian Logical Data Analysis for the Physical Sciences". > However, I feel that the Bayesian approach, is much more difficult to > translate to practical methods I can implement, but I may be biased by > the long term exposition to the "recipe based" Frequentist approach. > My opinion is that the Bayesian approach is so much about modelling the specific system under analysis that is I don't believe there is a shortcut or a book of recipes that fits every needs. In other words every practical application usually has its own peculiarities that frequently leads to a custom solution. Having said that, in my experience I frequently relied upon hierarchical/multilevel modelling that can be approached with Monte Carlo techniques. In that case the math can be simple (with caveats) and the hard part is done by the Monte Carlo sampler. 
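To make the last point concrete ("the hard part is done by the Monte Carlo sampler"), here is a deliberately tiny random-walk Metropolis sampler for the posterior of a normal mean with a normal prior, in plain NumPy. It is only an illustrative sketch (names and numbers are made up, not from this thread); a package such as PyMC, mentioned just below, does this and much more for you:

import numpy as np

def log_posterior(mu, data, prior_mu=0.0, prior_sd=10.0, noise_sd=1.0):
    # log prior (normal) + log likelihood (normal with known noise_sd), up to a constant
    log_prior = -0.5 * ((mu - prior_mu) / prior_sd) ** 2
    log_like = -0.5 * np.sum(((data - mu) / noise_sd) ** 2)
    return log_prior + log_like

def metropolis(data, n_samples=5000, step=0.5, seed=0):
    rng = np.random.RandomState(seed)
    draws = np.empty(n_samples)
    mu = 0.0
    logp = log_posterior(mu, data)
    for i in range(n_samples):
        proposal = mu + step * rng.randn()           # symmetric random-walk proposal
        logp_prop = log_posterior(proposal, data)
        if np.log(rng.rand()) < logp_prop - logp:    # Metropolis accept/reject
            mu, logp = proposal, logp_prop
        draws[i] = mu
    return draws

# posterior_draws = metropolis(data)[1000:]   # discard burn-in before summarizing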
> Can someone suggest me some resources (documentation or code) where some > practical approaches to Bayesian analysis are taught? > If you come from a frequentist mindset (t-test, ANOVA, etc.) you might find some quick and interesting thing in this book: http://www.ejwagenmakers.com/BayesCourse/BayesBook.html "A Practical Course in Bayesian Graphical Modeling", by Lee and Wagenmakers. In this book you will find a quick approach to the hierarchical/multilevel modelling mentioned above. The book illustrates examples using winBUGS - a sampler that I find puzzling and that I don't like. Luckily you can do the same with the very good PyMC http://code.google.com/p/pymc/ or by writing you own sampler. Best, Emanuele From njs at pobox.com Wed Feb 15 04:57:48 2012 From: njs at pobox.com (Nathaniel Smith) Date: Wed, 15 Feb 2012 09:57:48 +0000 Subject: [SciPy-User] [OT] Bayesian vs. frequentist In-Reply-To: <4F3AB8AA.2040402@molden.no> References: <4F3AB8AA.2040402@molden.no> Message-ID: I don't understand why this is always framed as a "versus" debate. "Bayesian methods" are the math for figuring out what to believe; "frequentist methods" are the math for figuring if you're fooling yourself. It makes perfect sense for engineers and in house estimates to use the former and the FDA and scientists the latter. Different methods answer different questions. I share everyone's frustration with ignorant people misinterpreting frequentist results, but contra point 18, I have begun to meet people doing the same to Bayesian methods, and I the tools are getting more accessible all the time. On Feb 14, 2012 7:40 PM, "Sturla Molden" wrote: > > After having worked with applied statistics for ~15 years, I have > reached this conclusion... ;-) > > > Sturla's 20 propositions on Bayesian vs. classical statistics: > ============================================================== > > 1. For simple data, a figure is sufficient, nobody really cares. > > 2. For dummy problems with known facit, Bayesian methods tend to be the > more accurate. > > 3. Bayesian methods include prior knowlege. A horse of 400 g is a priori > less likely than a horse of 400 kg. Frequentists say this is too > subjective. > > 4. Bayesian methods are easier to interpret. Few understand a > frequenctist confidence interval, albeit everybody they think they do. > > 5. Hypothesis testing: Bayesians answer the question we ask. Freuentists > don't. > > 6. Economists investing their own money are bayesians. > > 7. Economists investing your money are frequentists. > > 8. For basic medical research, nobody cares. > > 9. Drug trials: For getting an FDA application approved, frequentists > often yield a more 'significant result'. > > 10. Drug trials: For in-hose liability estimates, Bayesian methods are > the safer. > > 11. Frequentists can always get more significant results by "sampling > more data". > > 12. Frequentists don't care about stopping rules, even though they should. > > 13. Bayesians don't care about stopping rules bacause they don't have to. > > 14. "Significant" does not mean "important". Any tiny difference can be > made statistically significant. > > 15. For interpreting clinical lab tests, Bayesian methods prevail, e.g. > predictive values. > > 16. Engineers who know their mathematics use Bayesian methods. > > 17. Social scientists who don't know their mathematics are frequentists. > > 18. SPSS, Excel, Minitab, and SAS make it easy to be an ignorant > frequentist. > > 19. No tool make it easy to be an ignorant bayesian. > > 20. 
Competent analysts use R, Fortran, Matlab or Python. > > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > -------------- next part -------------- An HTML attachment was scrubbed... URL: From lou_boog2000 at yahoo.com Wed Feb 15 09:37:41 2012 From: lou_boog2000 at yahoo.com (Lou Pecora) Date: Wed, 15 Feb 2012 06:37:41 -0800 (PST) Subject: [SciPy-User] [OT] Bayesian vs. frequentist In-Reply-To: <4F3B6AF6.70009@grinta.net> References: <4F3AB8AA.2040402@molden.no> <4F3B6AF6.70009@grinta.net> Message-ID: <1329316661.30949.YahooMailNeo@web34404.mail.mud.yahoo.com> From: Daniele Nicolodi To: scipy-user at scipy.org Sent: Wednesday, February 15, 2012 3:21 AM Subject: Re: [SciPy-User] [OT] Bayesian vs. frequentist Hello, I'll hijack this thread to ask for advice. I'm a physicist and, as you may expect, my education in statistics is mostly in Frequentists methods. However, I always had an interest in Bayesian methods, as those seems to solve in much more natural ways the problems that arise in complex data analysis. I recently started to read "Data Analysis, A Bayesian Tutorial" by D.S. Silva (currently reading chapter 4, unfortunately real work is always interfering) and I really like the approach and the straight forward manner in which the theory builds up. However, I feel that the Bayesian approach, is much more difficult to translate to practical methods I can implement, but I may be biased by the long term exposition to the "recipe based" Frequentist approach. Can someone suggest me some resources (documentation or code) where some practical approaches to Bayesian analysis are taught? Thank you. Cheers, -- Daniele _______________________________________________ SciPy-User mailing list I'm also a physicist and just getting into all this. ?Silva's book is good. ?Here are two others I found that look good and readable. ?I have not read either all the way, but they are worth examining. ?You should also (after digesting some standard Bayesian statistics) examine the newer latent Dirichlet methods which look pretty powerful and seem to have a better way to handle and generate priors. ?Again, I'm a novice here, but these look like good avenues for a scientist trying to learn Bayesian statistics. (1) Udo von Toussaint, "Bayesian inference in physics", REVIEWS OF MODERN PHYSICS, VOLUME 83, JULY?SEPTEMBER 2011 (2)?Daniela Calvetti and Erkki Somersalo, Introduction to Bayesian scientific computing (Springer, 2007) It's a good topic even if it's OT -- provided everyone remains civil. ?:-)? ? -- Lou Pecora, my views are my own. ________________________________ -------------- next part -------------- An HTML attachment was scrubbed... URL: From afraser at lanl.gov Wed Feb 15 10:01:06 2012 From: afraser at lanl.gov (Fraser, Andrew McLeod) Date: Wed, 15 Feb 2012 15:01:06 +0000 Subject: [SciPy-User] Bayesian Data Assimilation in Python Message-ID: Lou, My book "Hidden Markov Models and Dynamical Systems" published by SIAM is about data assimilation in the simplest contexts. The approach is Bayesian and all of the examples are in python. (Now that numpy and scipy exist, I plan to update/improve the code to use them.) Andrew M. Fraser -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From pecora at anvil.nrl.navy.mil Wed Feb 15 10:42:04 2012 From: pecora at anvil.nrl.navy.mil (Lou Pecora) Date: Wed, 15 Feb 2012 10:42:04 -0500 Subject: [SciPy-User] Bayesian Data Assimilation in Python In-Reply-To: References: Message-ID: <4F3BD24C.1040107@anvil.nrl.navy.mil> On 2/15/12 10:01 AM, Fraser, Andrew McLeod wrote: > Lou, > > My book "Hidden Markov Models and Dynamical Systems" published by SIAM > is about data assimilation in the simplest contexts. The approach is > Bayesian and all of the examples are in python. (Now that numpy and > scipy exist, I plan to update/improve the code to use them.) > > Andrew M. Fraser > Thanks, Andy. Good to know. I am piling up Bayesian books. I hope I can get to read them. :-) -- Lou Pecora Code 6362 Naval Research Laboratory Washington, DC 20375, US ph: +202-767-6002 FAX: 202-767-1697 email: pecora at anvil.nrl.navy.mil -------------- next part -------------- An HTML attachment was scrubbed... URL: From william.ratcliff at gmail.com Wed Feb 15 11:11:37 2012 From: william.ratcliff at gmail.com (william ratcliff) Date: Wed, 15 Feb 2012 09:11:37 -0700 Subject: [SciPy-User] [OT] Bayesian vs. frequentist In-Reply-To: <1329316661.30949.YahooMailNeo@web34404.mail.mud.yahoo.com> References: <4F3AB8AA.2040402@molden.no> <4F3B6AF6.70009@grinta.net> <1329316661.30949.YahooMailNeo@web34404.mail.mud.yahoo.com> Message-ID: I'm another physicist and find Silva's book to be good. One of the things that I've used is maximum entropy in trying to reconstruct magnetization densities from neutron scattering data, rather than Fourier transforms (sad problems with termination effects...). I'd also like to use it more in model selection--for example, say you have a data set that you can fit to 4 gaussians, or 2--even if you get a "better" (lower chi^2), is it significant? BIQ can be useful... William On Wed, Feb 15, 2012 at 7:37 AM, Lou Pecora wrote: > *From:* Daniele Nicolodi > *To:* scipy-user at scipy.org > *Sent:* Wednesday, February 15, 2012 3:21 AM > *Subject:* Re: [SciPy-User] [OT] Bayesian vs. frequentist > > Hello, I'll hijack this thread to ask for advice. > > I'm a physicist and, as you may expect, my education in statistics is > mostly in Frequentists methods. However, I always had an interest in > Bayesian methods, as those seems to solve in much more natural ways the > problems that arise in complex data analysis. > > I recently started to read "Data Analysis, A Bayesian Tutorial" by D.S. > Silva (currently reading chapter 4, unfortunately real work is always > interfering) and I really like the approach and the straight forward > manner in which the theory builds up. > > However, I feel that the Bayesian approach, is much more difficult to > translate to practical methods I can implement, but I may be biased by > the long term exposition to the "recipe based" Frequentist approach. > > Can someone suggest me some resources (documentation or code) where some > practical approaches to Bayesian analysis are taught? > > Thank you. Cheers, > -- > Daniele > _______________________________________________ > SciPy-User mailing list > > > I'm also a physicist and just getting into all this. Silva's book is > good. Here are two others I found that look good and readable. I have not > read either all the way, but they are worth examining. You should also > (after digesting some standard Bayesian statistics) examine the newer > latent Dirichlet methods which look pretty powerful and seem to have a > better way to handle and generate priors. 
Again, I'm a novice here, but > these look like good avenues for a scientist trying to learn Bayesian > statistics. > > (1) Udo von Toussaint, "Bayesian inference in physics", REVIEWS OF MODERN > PHYSICS, VOLUME 83, JULY?SEPTEMBER 2011 > > (2) Daniela Calvetti and Erkki Somersalo, Introduction to Bayesian > scientific computing (Springer, 2007) > > > > It's a good topic even if it's OT -- provided everyone remains civil. :-) > > > > -- Lou Pecora, my views are my own. > ------------------------------ > > ** > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From k-assem84 at hotmail.com Tue Feb 14 10:25:43 2012 From: k-assem84 at hotmail.com (suzana8447) Date: Tue, 14 Feb 2012 07:25:43 -0800 (PST) Subject: [SciPy-User] [SciPy-user] scipy: integration Message-ID: <33322670.post@talk.nabble.com> Hello every body, I am a new user of scipy and I am facing some troubles of performing numerical integration. Suppose that we have a function called f(x,a,b) that we are going to integrate it from 0 to c. Note that a, b and c are some arrays. For example a=array([1., 0.5,3.]) , b=array([0.,1.,4.)] c= array([..,..,...]) By this way, I would expect that the program returns an array whereby each value of the array represents the integral at each term of the arrays a,b and c. Thanks in advance. Best regards. -- View this message in context: http://old.nabble.com/scipy%3A-integration-tp33322670p33322670.html Sent from the Scipy-User mailing list archive at Nabble.com. From mkpaustin at gmail.com Tue Feb 14 16:17:47 2012 From: mkpaustin at gmail.com (Phil Austin) Date: Tue, 14 Feb 2012 13:17:47 -0800 Subject: [SciPy-User] [OT] Bayesian vs. frequentist In-Reply-To: <4F3AC694.8080205@molden.no> References: <4F3AB8AA.2040402@molden.no> <4F3AC694.8080205@molden.no> Message-ID: <4F3ACF7B.2080204@gmail.com> On 12-02-14 12:39 PM, Sturla Molden wrote: > On 14.02.2012 21:24, josef.pktd at gmail.com wrote: > > > Do you expect an argument? sounds a bit like http://andrewgelman.com/ > > No I don't. I am just tired of explaining that "significant p-value" > does not imply "very important effect". Sorry for spamming the list with > my rant. Coincidentally, this discussion: http://andrewgelman.com/2012/02/adding-an-error-model-to-a-deterministic-model/ started when a civil engineering PhD posted a request for help. My reading of the ensuing discussion of both posts is that there is still a lot of work to do in bridging statistics (bayesian or frequentist) and deterministic modeling of complex systems. -- Phil From elofgren at email.unc.edu Wed Feb 15 03:28:15 2012 From: elofgren at email.unc.edu (Lofgren, Eric) Date: Wed, 15 Feb 2012 08:28:15 +0000 Subject: [SciPy-User] [OT] Bayesian vs. frequentist In-Reply-To: References: Message-ID: <5913DB1B-B807-4F19-877C-9B9742F4C5B9@unc.edu> Daniele, You're not wrong. I'm in a field that has (partially) embraced Bayesian methods, and one of the challenges in getting it adopted and used in general practice is that it is *much* harder to implement at times. It's not just a matter of the code itself - a tremendous amount of work has to go into obtaining the priors in order for them to be at all meaningful. You might want to consider popping over to Stack Overflow's sister site, Cross Validated (http://stats.stackexchange.com). 
I've found it to be an extremely helpful resource for asking questions both on a theory level and a practical applications/coding level. Eric On Feb 15, 2012, at 3:17 AM, > > Message: 5 > Date: Wed, 15 Feb 2012 09:21:10 +0100 > From: Daniele Nicolodi > Subject: Re: [SciPy-User] [OT] Bayesian vs. frequentist > To: scipy-user at scipy.org > Message-ID: <4F3B6AF6.70009 at grinta.net> > Content-Type: text/plain; charset=ISO-8859-1 > > Hello, I'll hijack this thread to ask for advice. > > I'm a physicist and, as you may expect, my education in statistics is > mostly in Frequentists methods. However, I always had an interest in > Bayesian methods, as those seems to solve in much more natural ways the > problems that arise in complex data analysis. > > I recently started to read "Data Analysis, A Bayesian Tutorial" by D.S. > Silva (currently reading chapter 4, unfortunately real work is always > interfering) and I really like the approach and the straight forward > manner in which the theory builds up. > > However, I feel that the Bayesian approach, is much more difficult to > translate to practical methods I can implement, but I may be biased by > the long term exposition to the "recipe based" Frequentist approach. > > Can someone suggest me some resources (documentation or code) where some > practical approaches to Bayesian analysis are taught? > > Thank you. Cheers, > -- > Daniele From k-assem84 at hotmail.com Wed Feb 15 08:36:59 2012 From: k-assem84 at hotmail.com (suzana8447) Date: Wed, 15 Feb 2012 05:36:59 -0800 (PST) Subject: [SciPy-User] [SciPy-user] Covariance matrix In-Reply-To: References: <33301423.post@talk.nabble.com> Message-ID: <33328834.post@talk.nabble.com> Thanks all for your help. What I have understood is that I get what so called the cov_x from least square root fit and then multipy this matrix by the error variance. I have two more questions. 1) What is meant by the error variance? How one can extract it? 2) Do you mean by ||err||= func-data? Thanks in advance. Charles R Harris wrote: > > On Mon, Feb 13, 2012 at 8:56 AM, wrote: > >> On Sat, Feb 11, 2012 at 7:39 PM, Kevin Gullikson >> wrote: >> > Use full_output=True when you call leastq, and you will get a matrix >> (among >> > other things). If you multiply that matrix by the standard deviation of >> the >> > residuals, it will be the covariance matrix. >> >> As Charles pointed out, multiply by the error variance not the >> standard deviation. Docstring is wrong in this. >> >> > Even more precisely, multiply by ||err||^2/(n - dof), since it is possible > that the error has an offset unless the model can perfectly fit a > constant. > If this actually makes a difference, the model is inadequate, but the > variance estimate might be useful if you are using something like the > Akaike information criterion to choose the number of parameters. > > > > Chuck > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > -- View this message in context: http://old.nabble.com/Covariance-matrix-tp33301423p33328834.html Sent from the Scipy-User mailing list archive at Nabble.com. From WILLIAM.GRIFFIN at asu.edu Wed Feb 15 11:35:09 2012 From: WILLIAM.GRIFFIN at asu.edu (William Griffin) Date: Wed, 15 Feb 2012 09:35:09 -0700 Subject: [SciPy-User] Bayesian Data Assimilation in Python In-Reply-To: References: Message-ID: Excellent book, by the way. -- Regards, Bill William A. Griffin, Ph.D. 
Center for Social Dynamics And Complexity President, CSSSA http://computationalsocialscience.org Arizona State University Tempe, AZ 85287-4804 william.griffin at asu.edu http://www.asu.edu/clas/csdc/bios/wgriffin.html On Wed, Feb 15, 2012 at 8:01 AM, Fraser, Andrew McLeod wrote: > Lou, > > My book "Hidden Markov Models and Dynamical Systems" published by SIAM is > about data assimilation in the simplest contexts. The approach is Bayesian > and all of the examples are in python. (Now that numpy and scipy exist, I > plan to update/improve the code to use them.) > > Andrew M. Fraser > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Wed Feb 15 11:42:30 2012 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Wed, 15 Feb 2012 11:42:30 -0500 Subject: [SciPy-User] [SciPy-user] Covariance matrix In-Reply-To: <33328834.post@talk.nabble.com> References: <33301423.post@talk.nabble.com> <33328834.post@talk.nabble.com> Message-ID: On Wed, Feb 15, 2012 at 8:36 AM, suzana8447 wrote: > > Thanks all for your ?help. > > What I have understood is that I get what so called the cov_x from least > square root fit and then multipy this matrix by the error variance. > > I have two more questions. > > 1) What is meant by the ?error variance? How one can extract it? > > 2) Do you mean by ||err||= func-data? (func-data).sum() / (n-k) squared sum of it an divided by number of observations minus number of parameters, as Chuck mentioned source of curve_fit is useful https://github.com/scipy/scipy/blob/master/scipy/optimize/minpack.py#L441 IIRC, infodict 'fvec' returns the squared error sum Josef > > Thanks in advance. > > Charles R Harris wrote: >> >> On Mon, Feb 13, 2012 at 8:56 AM, wrote: >> >>> On Sat, Feb 11, 2012 at 7:39 PM, Kevin Gullikson >>> wrote: >>> > Use full_output=True when you call leastq, and you will get a matrix >>> (among >>> > other things). If you multiply that matrix by the standard deviation of >>> the >>> > residuals, it will be the covariance matrix. >>> >>> As Charles pointed out, multiply by the error variance not the >>> standard deviation. Docstring is wrong in this. >>> >>> >> Even more precisely, multiply by ||err||^2/(n - dof), since it is possible >> that the error has an offset unless the model can perfectly fit a >> constant. >> If this actually makes a difference, the model is inadequate, but the >> variance estimate might be useful if you are using something like the >> Akaike information criterion to choose the number of parameters. >> >> >> >> Chuck >> >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user >> >> > > -- > View this message in context: http://old.nabble.com/Covariance-matrix-tp33301423p33328834.html > Sent from the Scipy-User mailing list archive at Nabble.com. > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user From johnl at cs.wisc.edu Wed Feb 15 11:47:53 2012 From: johnl at cs.wisc.edu (J. David Lee) Date: Wed, 15 Feb 2012 10:47:53 -0600 Subject: [SciPy-User] [OT] Bayesian vs. 
frequentist In-Reply-To: <1329316661.30949.YahooMailNeo@web34404.mail.mud.yahoo.com> References: <4F3AB8AA.2040402@molden.no> <4F3B6AF6.70009@grinta.net> <1329316661.30949.YahooMailNeo@web34404.mail.mud.yahoo.com> Message-ID: <4F3BE1B9.2060601@cs.wisc.edu> On 02/15/2012 08:37 AM, Lou Pecora wrote: > *From:* Daniele Nicolodi > *To:* scipy-user at scipy.org > *Sent:* Wednesday, February 15, 2012 3:21 AM > *Subject:* Re: [SciPy-User] [OT] Bayesian vs. frequentist > > Hello, I'll hijack this thread to ask for advice. > > I'm a physicist and, as you may expect, my education in statistics is > mostly in Frequentists methods. However, I always had an interest in > Bayesian methods, as those seems to solve in much more natural ways the > problems that arise in complex data analysis. > > I recently started to read "Data Analysis, A Bayesian Tutorial" by D.S. > Silva (currently reading chapter 4, unfortunately real work is always > interfering) and I really like the approach and the straight forward > manner in which the theory builds up. > > However, I feel that the Bayesian approach, is much more difficult to > translate to practical methods I can implement, but I may be biased by > the long term exposition to the "recipe based" Frequentist approach. > > Can someone suggest me some resources (documentation or code) where some > practical approaches to Bayesian analysis are taught? > > Thank you. Cheers, > -- > Daniele > _______________________________________________ > SciPy-User mailing list > > > I'm also a physicist and just getting into all this. Silva's book is > good. Here are two others I found that look good and readable. I > have not read either all the way, but they are worth examining. You > should also (after digesting some standard Bayesian statistics) > examine the newer latent Dirichlet methods which look pretty powerful > and seem to have a better way to handle and generate priors. Again, > I'm a novice here, but these look like good avenues for a scientist > trying to learn Bayesian statistics. > > (1) Udo von Toussaint, "Bayesian inference in physics", REVIEWS OF > MODERN PHYSICS, VOLUME 83, JULY?SEPTEMBER 2011 > > (2) Daniela Calvetti and Erkki Somersalo, Introduction to Bayesian > scientific computing (Springer, 2007) > > > > It's a good topic even if it's OT -- provided everyone remains civil. > :-) > > > -- Lou Pecora, my views are my own. I would also recommend the Toussaint paper. It contains several case studies from various areas of physics that you might find interesting. David -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Wed Feb 15 11:52:13 2012 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Wed, 15 Feb 2012 11:52:13 -0500 Subject: [SciPy-User] [SciPy-user] scipy: integration In-Reply-To: <33322670.post@talk.nabble.com> References: <33322670.post@talk.nabble.com> Message-ID: On Tue, Feb 14, 2012 at 10:25 AM, suzana8447 wrote: > > Hello every body, > > I am a new user of scipy and I am facing some troubles of performing > numerical integration. > > Suppose that we have a function called f(x,a,b) that we are going to > integrate it from 0 to c. > > Note that a, b and c are some arrays. > > For example a=array([1., 0.5,3.]) , b=array([0.,1.,4.)] ?c= > array([..,..,...]) > > By this way, I would expect that the program returns an array whereby each > value of the array represents the integral at each term of the arrays a,b > and c. AFAIK None of the integrators for functions are vectorized. 
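For instance, a sketch of the element-by-element loop this implies, using scipy.integrate.quad with the poster's f(x, a, b) and arrays a, b, c (the helper name below is made up for illustration):

import numpy as np
from scipy.integrate import quad

def integrate_elementwise(f, a, b, c):
    # one quad call per (a[i], b[i], c[i]); quad itself only accepts scalar limits
    out = np.empty(len(c))
    for i in range(len(c)):
        val, abserr = quad(f, 0.0, c[i], args=(a[i], b[i]))
        out[i] = val
    return out

# e.g. with f = lambda x, a, b: a * x + b
# integrate_elementwise(f, np.array([1.0, 0.5, 3.0]), np.array([0.0, 1.0, 4.0]), np.array([1.0, 2.0, 3.0]))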
So this would always need a loop. For lower precision, the integrators using samples have an axis argument if you build an array for different values of the parameters a and b. Only cumtrapz (or using ode) can integrate for different limits simultaneously. http://docs.scipy.org/doc/scipy/reference/integrate.html#integrating-functions-given-fixed-samples Which version to use depends on your requirements. Josef > > Thanks in advance. > > Best regards. > > -- > View this message in context: http://old.nabble.com/scipy%3A-integration-tp33322670p33322670.html > Sent from the Scipy-User mailing list archive at Nabble.com. > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user From fccoelho at gmail.com Wed Feb 15 13:52:28 2012 From: fccoelho at gmail.com (Flavio Coelho) Date: Wed, 15 Feb 2012 16:52:28 -0200 Subject: [SciPy-User] Bayesian Data Assimilation in Python In-Reply-To: References: Message-ID: What dou you mean by "Now that numpy and scipy exist"? They surely already existed when your book was published in 2009.... [?] Congrats, looks like an interesting book! On Wed, Feb 15, 2012 at 13:01, Fraser, Andrew McLeod wrote: > Lou, > > My book "Hidden Markov Models and Dynamical Systems" published by SIAM is > about data assimilation in the simplest contexts. The approach is Bayesian > and all of the examples are in python. (Now that numpy and scipy exist, I > plan to update/improve the code to use them.) > > Andrew M. Fraser > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > -- Fl?vio Code?o Coelho ================ +55(21) 3799-5567 Professor Escola de Matem?tica Aplicada Funda??o Get?lio Vargas Rio de Janeiro - RJ Brasil -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: 33E.png Type: image/png Size: 626 bytes Desc: not available URL: From afraser at lanl.gov Wed Feb 15 15:41:47 2012 From: afraser at lanl.gov (Fraser, Andrew McLeod) Date: Wed, 15 Feb 2012 20:41:47 +0000 Subject: [SciPy-User] Bayesian Data Assimilation in Python In-Reply-To: References: , Message-ID: When I started the book in 2002, I chose python. I shifted from C and Octave. I submitted my final draft to SIAM in 2007 and they published it in 2008. In the intervening years I used NUMERIC, Numarray, SWIG, gnuplot ... The tools and hardware get better continuously. As I was working on the book, some drafts took more than a day to build. When I get some time, I will rewrite the code to use scipy sparse matrices, cython, and matplotlib. It should be prettier and faster. ________________________________ From: scipy-user-bounces at scipy.org [scipy-user-bounces at scipy.org] on behalf of Flavio Coelho [fccoelho at gmail.com] Sent: Wednesday, February 15, 2012 11:52 AM To: SciPy Users List Subject: Re: [SciPy-User] Bayesian Data Assimilation in Python What dou you mean by "Now that numpy and scipy exist"? They surely already existed when your book was published in 2009.... [cid:gtalk.33E at goomoji.gmail] Congrats, looks like an interesting book! On Wed, Feb 15, 2012 at 13:01, Fraser, Andrew McLeod > wrote: Lou, My book "Hidden Markov Models and Dynamical Systems" published by SIAM is about data assimilation in the simplest contexts. The approach is Bayesian and all of the examples are in python. 
(Now that numpy and scipy exist, I plan to update/improve the code to use them.) Andrew M. Fraser _______________________________________________ SciPy-User mailing list SciPy-User at scipy.org http://mail.scipy.org/mailman/listinfo/scipy-user -- Fl?vio Code?o Coelho ================ +55(21) 3799-5567 Professor Escola de Matem?tica Aplicada Funda??o Get?lio Vargas Rio de Janeiro - RJ Brasil -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: 33E.png Type: image/png Size: 626 bytes Desc: 33E.png URL: From lanceboyle at qwest.net Thu Feb 16 01:56:34 2012 From: lanceboyle at qwest.net (Jerry) Date: Wed, 15 Feb 2012 23:56:34 -0700 Subject: [SciPy-User] [OT] Bayesian vs. frequentist In-Reply-To: <4F3AB8AA.2040402@molden.no> References: <4F3AB8AA.2040402@molden.no> Message-ID: Well, this is an interesting discussion. On a lighter but probably no less interesting note, there is a new book by Sharon Bertsch McGrayne, "The Theory That Would Not Die -- How Bayes' rule cracked the enigma code, hunted down Russian submarines, & emerged triumphant from two centuries of controversy." It is an amazingly detailed history of Bayes' Rule (did you know that it really should be called Laplace's Rule?) often vis-?-vis frequentism. Jerry From cournape at gmail.com Thu Feb 16 03:16:43 2012 From: cournape at gmail.com (David Cournapeau) Date: Thu, 16 Feb 2012 08:16:43 +0000 Subject: [SciPy-User] [OT] Bayesian vs. frequentist In-Reply-To: References: <4F3AB8AA.2040402@molden.no> Message-ID: Le 15 f?vr. 2012 09:57, "Nathaniel Smith" a ?crit : > > I don't understand why this is always framed as a "versus" debate. "Bayesian methods" are the math for figuring out what to believe; "frequentist methods" are the math for figuring if you're fooling yourself. It makes perfect sense for engineers and in house estimates to use the former and the FDA and scientists the latter. Different methods answer different questions. I believe the vs is at least partially justified, in the sense that they are fundamentally incompatible. This is the only mathematical field I am aware of where you have to "competing" frameworks. But a same person can certainly use either for a particular analysis. To be contrarian, there is a somehow famous paper from Judea Pearl (the father of Bayesian networks), "Bayesianism and causality, or, why I am only half-bayesian". The conversations between Gelman and Wasserman are also helpful to show strenghts/weaknesses of each approach. My own conclusion is that I still don't understand much about statistics. David -------------- next part -------------- An HTML attachment was scrubbed... URL: From fccoelho at gmail.com Thu Feb 16 06:07:03 2012 From: fccoelho at gmail.com (Flavio Coelho) Date: Thu, 16 Feb 2012 09:07:03 -0200 Subject: [SciPy-User] [OT] Bayesian vs. frequentist In-Reply-To: References: <4F3AB8AA.2040402@molden.no> Message-ID: Thanks for the tip about the book, I just bought it! cheers, Fl?vo On Thu, Feb 16, 2012 at 04:56, Jerry wrote: > Well, this is an interesting discussion. On a lighter but probably no less > interesting note, there is a new book by Sharon Bertsch McGrayne, "The > Theory That Would Not Die -- How Bayes' rule cracked the enigma code, > hunted down Russian submarines, & emerged triumphant from two centuries of > controversy." It is an amazingly detailed history of Bayes' Rule (did you > know that it really should be called Laplace's Rule?) 
often vis-?-vis > frequentism. > > Jerry > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > -- Fl?vio Code?o Coelho ================ +55(21) 3799-5567 Professor Escola de Matem?tica Aplicada Funda??o Get?lio Vargas Rio de Janeiro - RJ Brasil -------------- next part -------------- An HTML attachment was scrubbed... URL: From arokem at gmail.com Wed Feb 15 17:00:56 2012 From: arokem at gmail.com (Ariel Rokem) Date: Wed, 15 Feb 2012 14:00:56 -0800 Subject: [SciPy-User] [ANN] release of Neo 0.2.0 In-Reply-To: <4F3B73B9.8050408@olfac.univ-lyon1.fr> References: <4F38D337.7000703@olfac.univ-lyon1.fr> <4F3B73B9.8050408@olfac.univ-lyon1.fr> Message-ID: Hi Samuel, Congratulations on the release! Concerning the comparison with nitime: At the moment, support for reading data from files in nitime is only implemented for neuroimaging data (using nibabel as an optional dependency, only for this bit), but the core classes in nitime can be used for any kind of time-series data (including data from single-cell recordings: http://nipy.sourceforge.net/nitime/examples/grasshopper.html), assuming the data has already somehow been read into numpy arrays. In fact, it would be great to have io functions in nitime which read data using neo and initialize nitime.TimeSeries objects for further analysis. Cheers, Ariel On Wed, Feb 15, 2012 at 12:58 AM, Samuel Garcia wrote: > Nitime is oriented for time series of neuro imaging. > > Neo offer a complete data model center around electrophysiology (in vivo > or simulations) with more objects than nitime (for this fields of course): > RecodingChannelGroup +RecodingChannel + Segment + Block +Unit+... > > Samuel > > > > > > Le 14/02/2012 16:15, Larry Eisenman a ?crit : > > Samuel Garcia olfac.univ-lyon1.fr> writes: > > > >> > >> Dear scipy list, > >> We are proud to announce the 0.2.0 release of Neo, a Python > library for > > working with > >> electrophysiology data, whether from biological experiments or from > >> simulations. > > .... > > > > How does this compare/relate to the Nitime module of the Nipy project > > (http://nipy.sourceforge.net/nitime/)? > > > > Larry > > > > _______________________________________________ > > SciPy-User mailing list > > SciPy-User at scipy.org > > http://mail.scipy.org/mailman/listinfo/scipy-user > > -- > ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ > Samuel Garcia > Lyon Neuroscience > CNRS - UMR5292 - INSERM U1028 - Universite Claude Bernard LYON 1 > Equipe R et D > 50, avenue Tony Garnier > 69366 LYON Cedex 07 > FRANCE > T?l : 04 37 28 74 24 > Fax : 04 37 28 76 01 > http://olfac.univ-lyon1.fr/unite/equipe-07/ > http://neuralensemble.org/trac/OpenElectrophy > ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > -------------- next part -------------- An HTML attachment was scrubbed... URL: From vanforeest at gmail.com Thu Feb 16 10:22:58 2012 From: vanforeest at gmail.com (nicky van foreest) Date: Thu, 16 Feb 2012 16:22:58 +0100 Subject: [SciPy-User] [OT] Bayesian vs. frequentist In-Reply-To: References: <4F3AB8AA.2040402@molden.no> Message-ID: +1. I added the book to my wish list. I also enjoy(ed) the discussion. Nicky On 16 February 2012 12:07, Flavio Coelho wrote: > Thanks for the tip about the book, I just bought it! 
> > cheers, > > Fl?vo > > On Thu, Feb 16, 2012 at 04:56, Jerry wrote: >> >> Well, this is an interesting discussion. On a lighter but probably no less >> interesting note, there is a new book by Sharon Bertsch McGrayne, "The >> Theory That Would Not Die -- How Bayes' rule cracked the enigma code, hunted >> down Russian submarines, & emerged triumphant from two centuries of >> controversy." It is an amazingly detailed history of Bayes' Rule (did you >> know that it really should be called Laplace's Rule?) often vis-?-vis >> frequentism. >> >> Jerry >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user > > > > > -- > Fl?vio Code?o Coelho > ================ > +55(21) 3799-5567 > Professor > Escola de Matem?tica Aplicada > Funda??o Get?lio Vargas > Rio de Janeiro - RJ > Brasil > > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From ljmamoreira at gmail.com Thu Feb 16 12:49:41 2012 From: ljmamoreira at gmail.com (Jose Amoreira) Date: Thu, 16 Feb 2012 17:49:41 +0000 Subject: [SciPy-User] How to use module parameters in dimension() Message-ID: <1595876.WsfY7zjZfd@mu.site> Hello I have a module that defines the dimension of an array as a parameter, and a subroutine in that module that computes and returns that array dimension: 1 module tests 2 implicit none 3 integer, parameter:: n=3 !Dimension of arrays 4 5 contains 6 7 subroutine calc(t,z) 8 real, intent(in):: t 9 real, dimension(n), intent(out):: z 10 z= 2*n + t 11 end subroutine calc 12 end module tests This simple example works fine in fortran. But when I try to turn it into a python module with f2py, the process fails with exit status 1, stating that: "In function ?f2py_rout_tests_tests_calc?: /tmp/tmpZZiuce/src.linux-x86_64-2.7/testsmodule.c:230:14: error: ?n? undeclared (first use in this function)" I find it strange that this failure only occurs if array z is a dummy argument of the calc subroutine. Otherwise, as in the listing below, f2py doesn't complain. 1 module tests 2 implicit none 3 integer, parameter:: n=3 4 5 contains 6 7 subroutine calc(t,x) 8 real,intent(in):: t 9 real,intent(out):: x 10 real,dimension(n):: z 11 z=0. 12 x= 2*n + t 13 end subroutine calc 14 end module tests So, my problem: is there any way to fix this? I mean, is it possible for f2py to compile a fortran module containing subroutines with parametrized dimension dummy arguments? Or am I missing some trivial tweak here? Thanks, Jose Amoreira -------------- next part -------------- An HTML attachment was scrubbed... URL: From sgarcia at olfac.univ-lyon1.fr Thu Feb 16 12:57:56 2012 From: sgarcia at olfac.univ-lyon1.fr (Samuel Garcia) Date: Thu, 16 Feb 2012 18:57:56 +0100 Subject: [SciPy-User] [ANN] release of Neo 0.2.0 In-Reply-To: References: <4F38D337.7000703@olfac.univ-lyon1.fr> <4F3B73B9.8050408@olfac.univ-lyon1.fr> Message-ID: <4F3D43A4.3030903@olfac.univ-lyon1.fr> Of course. A global convergence would great! (neo team already plane to propose to merge with nitime). But at the moment it is not so easy: nitime = object model + analysis nibabel = IO collections for NI neo = object model + IO collection for electrophysiology The main work is that nitime's object model and neo's object model differ. nitime objects are neutral. 
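Returning to the f2py question a little further up: one possible workaround (only a sketch, not a verified fix) is to avoid using the module parameter in the dummy-argument declaration and instead pass the size as an ordinary argument, which f2py wraps without trouble:

module tests
  implicit none
  integer, parameter :: n = 3
contains
  ! Workaround sketch: dimension the intent(out) argument with an explicit
  ! integer argument m instead of the module parameter n.
  subroutine calc(t, z, m)
    real, intent(in) :: t
    integer, intent(in) :: m
    real, dimension(m), intent(out) :: z
    z = 2*n + t
  end subroutine calc
end module tests

From Python the generated wrapper then takes the size as an input, roughly z = tests.calc(t, 3); the exact signature depends on the f2py version, so it is worth checking print(tests.calc.__doc__) after building.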
Neo object are colored for the use.For example, neo also add containers objects ("one to many" and "many to many" relationship between object) But doing a conversion script could be very easy (if loosing the relationship between objects): neo.AnalogSignal >>> nittime.TimeSerie neo.SpikeTrain >>> nittime.TimeArrays neo.EpochArray >>> list of nitime.Epoch neo.EventArray >>> list of nitime.Event neo.Epoch >>> nitime.Epoch neo.Event >>> nitime.Event Le 15/02/2012 23:00, Ariel Rokem a ?crit : > Hi Samuel, > > Congratulations on the release! > > Concerning the comparison with nitime: At the moment, support for > reading data from files in nitime is only implemented for neuroimaging > data (using nibabel as an optional dependency, only for this bit), but > the core classes in nitime can be used for any kind of time-series > data (including data from single-cell recordings: > http://nipy.sourceforge.net/nitime/examples/grasshopper.html), > assuming the data has already somehow been read into numpy arrays. In > fact, it would be great to have io functions in nitime which read data > using neo and initialize nitime.TimeSeries objects for further analysis. > > Cheers, > > Ariel > > > > On Wed, Feb 15, 2012 at 12:58 AM, Samuel Garcia > > wrote: > > Nitime is oriented for time series of neuro imaging. > > Neo offer a complete data model center around electrophysiology > (in vivo > or simulations) with more objects than nitime (for this fields of > course): > RecodingChannelGroup +RecodingChannel + Segment + Block +Unit+... > > Samuel > > > > > > Le 14/02/2012 16:15, Larry Eisenman a ?crit : > > Samuel Garcia olfac.univ-lyon1.fr > > writes: > > > >> > >> Dear scipy list, > >> We are proud to announce the 0.2.0 release of Neo, a > Python library for > > working with > >> electrophysiology data, whether from biological > experiments or from > >> simulations. > > .... > > > > How does this compare/relate to the Nitime module of the Nipy > project > > (http://nipy.sourceforge.net/nitime/)? > > > > Larry > > > > _______________________________________________ > > SciPy-User mailing list > > SciPy-User at scipy.org > > http://mail.scipy.org/mailman/listinfo/scipy-user > > -- > ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ > Samuel Garcia > Lyon Neuroscience > CNRS - UMR5292 - INSERM U1028 - Universite Claude Bernard LYON 1 > Equipe R et D > 50, avenue Tony Garnier > 69366 LYON Cedex 07 > FRANCE > T?l : 04 37 28 74 24 > Fax : 04 37 28 76 01 > http://olfac.univ-lyon1.fr/unite/equipe-07/ > http://neuralensemble.org/trac/OpenElectrophy > ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user -- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Samuel Garcia Lyon Neuroscience CNRS - UMR5292 - INSERM U1028 - Universite Claude Bernard LYON 1 Equipe R et D 50, avenue Tony Garnier 69366 LYON Cedex 07 FRANCE T?l : 04 37 28 74 24 Fax : 04 37 28 76 01 http://olfac.univ-lyon1.fr/unite/equipe-07/ http://neuralensemble.org/trac/OpenElectrophy ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From jaakko.luttinen at iki.fi Thu Feb 16 13:09:52 2012 From: jaakko.luttinen at iki.fi (Jaakko Luttinen) Date: Thu, 16 Feb 2012 20:09:52 +0200 Subject: [SciPy-User] "Zero"-shape sparse matrices Message-ID: <4F3D4670.9040906@iki.fi> Hi! To make a long story short, Scipy doesn't seem to allow sparse matrices that have length zero on any of the axes. For instance: C = numpy.ones((0,0)) K = scipy.sparse.csc_matrix(C) ValueError: invalid shape It is possible to create a "zero"-shape dense matrix but not sparse. Why? To me, this seems like a bug.. Is it so? Thanks for any help! Best regards, Jaakko From jaakko.luttinen at iki.fi Thu Feb 16 13:09:53 2012 From: jaakko.luttinen at iki.fi (Jaakko Luttinen) Date: Thu, 16 Feb 2012 20:09:53 +0200 Subject: [SciPy-User] Elementwise products of dense and sparse matrices Message-ID: <4F3D4671.20207@iki.fi> Hi! I am trying to compute elementwise products of dense and sparse matrices. The products work for two dense matrices or for two sparse matrices but the product of a dense and a sparse matrix does not work. See the code below: >>> import numpy as np >>> import scipy as sp >>> A = sp.sparse.csc_matrix(np.identity(5)) >>> B = np.asmatrix(np.ones((5,5))) >>> np.multiply(A,A) <5x5 sparse matrix of type '' with 5 stored elements in Compressed Sparse Column format> >>> np.multiply(B,B) matrix([[ 1., 1., 1., 1., 1.], [ 1., 1., 1., 1., 1.], [ 1., 1., 1., 1., 1.], [ 1., 1., 1., 1., 1.], [ 1., 1., 1., 1., 1.]]) >>> np.multiply(B,A) NotImplemented >>> np.multiply(A,B) matrix([[ (0, 0) 1.0 (1, 1) 1.0 (2, 2) 1.0 (3, 3) 1.0 (4, 4) 1.0, (0, 0) 1.0 (1, 1) 1.0 (2, 2) 1.0 (3, 3) 1.0 (4, 4) 1.0, ....... (0, 0) 1.0 (1, 1) 1.0 (2, 2) 1.0 (3, 3) 1.0 (4, 4) 1.0]], dtype=object) So elementwise B*A is not implemented and A*B gives some horrible matrix with dtype=object. The A*B or B*A product can be computed as A.multiply(B) but it would be more convenient if np.multiply could handle these situations. Can this be considered as a bug or missing feature or is there some rationale behind this? Thanks for your help! Best, Jaakko From ralf.gommers at googlemail.com Thu Feb 16 13:44:29 2012 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Thu, 16 Feb 2012 19:44:29 +0100 Subject: [SciPy-User] ImportError: DLL load failed: The specified module could not be found In-Reply-To: References: Message-ID: On Tue, Feb 14, 2012 at 5:29 AM, Khary Richardson wrote: > 0 down vote favorite > share [g+] share [fb] share [tw] > > Hi I downloaded and ran the numpy-1.6.1-win32-superpack-python3.2.exe, > and I installed scipy-0.10.0.win32-py3.2 from the scipy sourceforge site. > When I try to run a code that uses scipy.special I get the following error > > Traceback (most recent call last): > File "C:\Documents and Settings\Khary\My Documents\PHYSICS\ > Physics\Bessel.py", line 5, in > from scipy.special import jv, jvp > File "C:\Python32\lib\site-packages\scipy\special\__init__.py", line > 525, in > from ._cephes import * > ImportError: DLL load failed: The specified module could not be found. > > any help woulb be great. > > Some more information would be helpful: - are you on XP, Vista, Win7? - how did you install Python (which binary)? - does the file "C:\Python32\lib\site-packages\scipy\special\_cephes.pyd" exist? - can you import and use other scipy packages? - do you get any errors or failures for "import numpy; numpy.test('full')"? Ralf -------------- next part -------------- An HTML attachment was scrubbed... 
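A small script that walks through this checklist in one go -- the _cephes.pyd path is taken from the traceback in the report above (adjust as needed), and numpy.test('full') needs nose installed:

import os
import platform

print(platform.platform(), platform.architecture())

pyd = r"C:\Python32\lib\site-packages\scipy\special\_cephes.pyd"
print(pyd, "exists:", os.path.exists(pyd))

try:
    from scipy import linalg, ndimage   # a couple of other scipy subpackages
    print("scipy.linalg and scipy.ndimage import fine")
except ImportError as err:
    print("import failed:", err)

import numpy
numpy.test('full')   # report any errors or failures here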
URL: From surfcast23 at gmail.com Thu Feb 16 22:35:47 2012 From: surfcast23 at gmail.com (Khary Richardson) Date: Thu, 16 Feb 2012 22:35:47 -0500 Subject: [SciPy-User] ImportError: DLL load failed: The specified module could not be found In-Reply-To: References: Message-ID: Hi Ralf, I downloaded numpy and scipy from the source forge site linked through the scipy site. I am not getting any other error messages. I am not at home at the moment and am not sure if cephes.py exists or not. I am running xp sp3. Thanks! Khary On Feb 16, 2012 1:44 PM, "Ralf Gommers" wrote: > > > On Tue, Feb 14, 2012 at 5:29 AM, Khary Richardson wrote: > >> 0 down vote favorite >> share [g+] share [fb] share [tw] >> >> Hi I downloaded and ran the numpy-1.6.1-win32-superpack-python3.2.exe, >> and I installed scipy-0.10.0.win32-py3.2 from the scipy sourceforge site. >> When I try to run a code that uses scipy.special I get the following error >> >> Traceback (most recent call last): >> File "C:\Documents and Settings\Khary\My Documents\PHYSICS\ >> Physics\Bessel.py", line 5, in >> from scipy.special import jv, jvp >> File "C:\Python32\lib\site-packages\scipy\special\__init__.py", line >> 525, in >> from ._cephes import * >> ImportError: DLL load failed: The specified module could not be found. >> >> any help woulb be great. >> >> Some more information would be helpful: > - are you on XP, Vista, Win7? > - how did you install Python (which binary)? > - does the file "C:\Python32\lib\site-packages\scipy\special\_cephes.pyd" > exist? > - can you import and use other scipy packages? > - do you get any errors or failures for "import numpy; numpy.test('full')"? > > Ralf > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From denis-bz-gg at t-online.de Fri Feb 17 05:46:27 2012 From: denis-bz-gg at t-online.de (denis) Date: Fri, 17 Feb 2012 02:46:27 -0800 (PST) Subject: [SciPy-User] [OT] Bayesian vs. frequentist In-Reply-To: <4F3AB8AA.2040402@molden.no> References: <4F3AB8AA.2040402@molden.no> Message-ID: <069d65bf-98e1-4053-95ac-005bce93a76b@f30g2000yqh.googlegroups.com> Sturla, that's funny. In the same vein, there are two kinds of customers: - smart, but no money - money, but no smarts. On a related split see Breiman, "Statistical Modeling: The Two Cultures" http://projecteuclid.org/euclid.ss/1009213726 2001 17p + 16 debate related because many splits are cultural, e.g. practitioners vs academics. I can't begin to summarize Breiman -- read it, it's really good. See also http://stats.stackexchange.com/questions/6/the-two-cultures-statistics-vs-machine-learning cheers -- denis On Feb 14, 8:40?pm, Sturla Molden wrote: > After having worked with applied statistics for ~15 years, I have > reached this conclusion... ;-) > > Sturla's 20 propositions on Bayesian vs. classical statistics: From jaakko.luttinen at iki.fi Fri Feb 17 06:04:00 2012 From: jaakko.luttinen at iki.fi (Jaakko Luttinen) Date: Fri, 17 Feb 2012 13:04:00 +0200 Subject: [SciPy-User] Dot product of a matrix and an array Message-ID: <4F3E3420.9030109@iki.fi> Hi! The dot product of a matrix and an array seems to return a matrix. However, this resulting matrix seems to have inconsistent shape. For simplicity, let I be an identity matrix (matrix object) and x a vector (1-d array object). Then np.dot gives wrong dimensions for I*x which causes that one can not compute I*(I*x). 
See the code below: >>> >>> import numpy as np >>> >>> x = np.arange(5) >>> >>> I = np.asmatrix(np.identity(5)) >>> >>> np.dot(I,x) matrix([[ 0., 1., 2., 3., 4.]]) >>> >>> np.dot(I, np.dot(I,x)) Traceback (most recent call last): File "", line 1, in ValueError: matrices are not aligned I think np.dot(I,x) should return either 1-d vector (array object) or 2-d column vector (array or matrix object), but NOT 2-d row vector because that, in my opinion, is incorrect interpretation of the dot product. Also, I think numpy.dot should return an array object when given an array and a matrix, because the given array might have more than two dimensions (which is okay by the definition of numpy.dot) so the resulting object should be able to handle that. Now numpy.dot seems to give errors in such cases. Best regards, Jaakko From pav at iki.fi Fri Feb 17 06:16:01 2012 From: pav at iki.fi (Pauli Virtanen) Date: Fri, 17 Feb 2012 12:16:01 +0100 Subject: [SciPy-User] Dot product of a matrix and an array In-Reply-To: <4F3E3420.9030109@iki.fi> References: <4F3E3420.9030109@iki.fi> Message-ID: Hi, 17.02.2012 12:04, Jaakko Luttinen kirjoitti: [clip] > >>> import numpy as np > >>> x = np.arange(5) > >>> I = np.asmatrix(np.identity(5)) > >>> np.dot(I,x) > matrix([[ 0., 1., 2., 3., 4.]]) > >>> np.dot(I, np.dot(I,x)) > Traceback (most recent call last): > File "", line 1, in > ValueError: matrices are not aligned > > I think np.dot(I,x) should return either 1-d vector (array object) or > 2-d column vector (array or matrix object), but NOT 2-d row vector > because that, in my opinion, is incorrect interpretation of the dot product. Yep, that's inconsistent behavior. What probably happens is that np.dot(I, x) -> np.asmatrix(np.dot(np.asarray(I), x)) As you can maybe deduce, the matrix class is not as heavily used as the arrays... http://projects.scipy.org/numpy/ticket/2057 -- Pauli Virtanen From pav at iki.fi Fri Feb 17 06:44:20 2012 From: pav at iki.fi (Pauli Virtanen) Date: Fri, 17 Feb 2012 12:44:20 +0100 Subject: [SciPy-User] Elementwise products of dense and sparse matrices In-Reply-To: <4F3D4671.20207@iki.fi> References: <4F3D4671.20207@iki.fi> Message-ID: Hi, 16.02.2012 19:09, Jaakko Luttinen kirjoitti: [clip] > Can this be considered as a bug or missing feature or is there some > rationale behind this? It's a bug and a missing feature. The integration between sparse matrices and the dense-matrix functions (multiply/add/sin/tanh/etc...) in Numpy is not as good as it could be at the moment. http://projects.scipy.org/scipy/ticket/1598 -- Pauli Virtanen From alan.isaac at gmail.com Fri Feb 17 08:55:01 2012 From: alan.isaac at gmail.com (Alan G Isaac) Date: Fri, 17 Feb 2012 08:55:01 -0500 Subject: [SciPy-User] Dot product of a matrix and an array In-Reply-To: <4F3E3420.9030109@iki.fi> References: <4F3E3420.9030109@iki.fi> Message-ID: <4F3E5C35.5090107@gmail.com> On 2/17/2012 6:04 AM, Jaakko Luttinen wrote: > The dot product of a matrix and an array seems to return a matrix. > However, this resulting matrix seems to have inconsistent shape. For > simplicity, let I be an identity matrix (matrix object) and x a vector > (1-d array object). Then np.dot gives wrong dimensions for I*x which > causes that one can not compute I*(I*x). It is unclear what a consistent shape means in this context. (For me it would be to return a 1d array.) For this reason, the suggestion resurfaces from time to time to raise an error for multiplication between a matrix and an array. 
In any case, computations mixing matrices and arrays are a "bad idea". fwiw, Alan Isaac From nwagner at iam.uni-stuttgart.de Fri Feb 17 09:02:19 2012 From: nwagner at iam.uni-stuttgart.de (Nils Wagner) Date: Fri, 17 Feb 2012 15:02:19 +0100 Subject: [SciPy-User] Dot product of a matrix and an array In-Reply-To: <4F3E5C35.5090107@gmail.com> References: <4F3E3420.9030109@iki.fi> <4F3E5C35.5090107@gmail.com> Message-ID: On Fri, 17 Feb 2012 08:55:01 -0500 Alan G Isaac wrote: > On 2/17/2012 6:04 AM, Jaakko Luttinen wrote: >> The dot product of a matrix and an array seems to return >>a matrix. >> However, this resulting matrix seems to have >>inconsistent shape. For >> simplicity, let I be an identity matrix (matrix object) >>and x a vector >> (1-d array object). Then np.dot gives wrong dimensions >>for I*x which >> causes that one can not compute I*(I*x). > > > It is unclear what a consistent shape means in this >context. > (For me it would be to return a 1d array.) >For this reason, the suggestion resurfaces from time to >time > to raise an error for multiplication between a matrix >and an array. > In any case, computations mixing matrices and arrays are >a "bad idea". > > fwiw, > Alan Isaac This reminds me of an old ticket http://projects.scipy.org/scipy/ticket/585 Nils From jaakko.luttinen at aalto.fi Thu Feb 16 10:44:40 2012 From: jaakko.luttinen at aalto.fi (Jaakko Luttinen) Date: Thu, 16 Feb 2012 17:44:40 +0200 Subject: [SciPy-User] "Zero"-shape sparse matrices Message-ID: <4F3D2468.4060700@aalto.fi> Hi! To make a long story short, Scipy doesn't seem to allow sparse matrices that have length zero on any of the axes. For instance: C = numpy.ones((0,0)) K = scipy.sparse.csc_matrix(C) ValueError: invalid shape It is possible to create a "zero"-shape dense matrix but not sparse. Why? To me, this seems like a bug.. Is it so? Thanks for any help! Best regards, Jaakko From jaakko.luttinen at aalto.fi Fri Feb 17 06:00:18 2012 From: jaakko.luttinen at aalto.fi (Jaakko Luttinen) Date: Fri, 17 Feb 2012 13:00:18 +0200 Subject: [SciPy-User] Dot product of a matrix and an array Message-ID: <4F3E3342.6090808@aalto.fi> Hi! The dot product of a matrix and an array seems to return a matrix. However, this resulting matrix seems to have inconsistent shape. For simplicity, let I be an identity matrix (matrix object) and x a vector (1-d array object). Then np.dot gives wrong dimensions for I*x which causes that one can not compute I*(I*x). See the code below: >>> import numpy as np >>> x = np.arange(5) >>> I = np.asmatrix(np.identity(5)) >>> np.dot(I,x) matrix([[ 0., 1., 2., 3., 4.]]) >>> np.dot(I, np.dot(I,x)) Traceback (most recent call last): File "", line 1, in ValueError: matrices are not aligned I think np.dot(I,x) should return either 1-d vector (array object) or 2-d column vector (array or matrix object), but NOT 2-d row vector because that, in my opinion, is incorrect interpretation of the dot product. Also, I think numpy.dot should return an array object when given an array and a matrix, because the given array might have more than two dimensions (which is okay by the definition of numpy.dot) so the resulting object should be able to handle that. 
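A minimal workaround sketch for the shape problem above, until the tickets mentioned in this thread are resolved: keep the operands as plain ndarrays (convert the matrix first), so dot() returns a 1-d result that can be fed straight back into dot():

import numpy as np

x = np.arange(5)
I = np.asmatrix(np.identity(5))
A = np.asarray(I)             # drop the matrix subclass

y = np.dot(A, x)              # shape (5,), a 1-d array
z = np.dot(A, np.dot(A, x))   # works, unlike np.dot(I, np.dot(I, x))
print(y.shape, z.shape)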
Best regards, Jaakko From e.antero.tammi at gmail.com Fri Feb 17 10:32:23 2012 From: e.antero.tammi at gmail.com (eat) Date: Fri, 17 Feb 2012 17:32:23 +0200 Subject: [SciPy-User] "Zero"-shape sparse matrices In-Reply-To: <4F3D2468.4060700@aalto.fi> References: <4F3D2468.4060700@aalto.fi> Message-ID: Hi, Perhaps a slightly OT (and I'm not really answering to your question), but On Thu, Feb 16, 2012 at 5:44 PM, Jaakko Luttinen wrote: > Hi! > > To make a long story short, Scipy doesn't seem to allow sparse matrices > that have length zero on any of the axes. For instance: > C = numpy.ones((0,0)) > K = scipy.sparse.csc_matrix(C) > ValueError: invalid shape > > It is possible to create a "zero"-shape dense matrix but not sparse. > Why? To me, this seems like a bug.. Is it so? > what would you expect a "zero"-shape sparse (or dense) matrix actually represent? Regards, -eat > > Thanks for any help! > Best regards, > Jaakko > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > -------------- next part -------------- An HTML attachment was scrubbed... URL: From apalomba at austin.rr.com Fri Feb 17 11:35:57 2012 From: apalomba at austin.rr.com (Anthony Palomba) Date: Fri, 17 Feb 2012 10:35:57 -0600 Subject: [SciPy-User] Trouble with numpy on OSX... Message-ID: I am trying to get my scipy environment running on my mac. I have a MBP running OSX 10.7 with python2.7 (python.org) installed. I installed scipy-0.10.0-py2.7-python.org-macosx10.3 and numpy-1.6.1-py2.7-python.org-macosx10.3. When I try to import multiarray, i get the following error... ImportError: dlopen(/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/numpy/core/multiarray.so, 2): no suitable image found. Did find: /Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/numpy/core/multiarray.so: no matching architecture in universal wrapper Is there something I am missing? Thanks, Anthony -------------- next part -------------- An HTML attachment was scrubbed... URL: From cmutel at gmail.com Fri Feb 17 11:48:58 2012 From: cmutel at gmail.com (Christopher Mutel) Date: Fri, 17 Feb 2012 17:48:58 +0100 Subject: [SciPy-User] "Zero"-shape sparse matrices In-Reply-To: References: <4F3D2468.4060700@aalto.fi> Message-ID: On Fri, Feb 17, 2012 at 4:32 PM, eat wrote: > On Thu, Feb 16, 2012 at 5:44 PM, Jaakko Luttinen > wrote: >> To make a long story short, Scipy doesn't seem to allow sparse matrices >> that have length zero on any of the axes. For instance: >> C = numpy.ones((0,0)) >> K = scipy.sparse.csc_matrix(C) >> ValueError: invalid shape >> >> It is possible to create a "zero"-shape dense matrix but not sparse. >> Why? To me, this seems like a bug.. Is it so? I am not an expert, but it is my understanding that the sparse matrix implementations in SciPy assume precisely two dimensions. One dimension having a size of 0 would break all the assumptions of this code. The NumPy array class is a much more generic container, and was designed from the beginning to allow a number of slicing and dimensionality tricks (see the documentation on numpy striding). You can search through the mailing list from a few years ago to find a discussion about three dimensional sparse matrices, and the conclusion was the same: SciPy supports 2-d (in the sense of two real dimensions) sparse matrices only. 
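A small fallback sketch for the zero-size case discussed in this thread, given that scipy.sparse (0.10) rejects shapes with a zero dimension: keep such arrays dense and only build a csc_matrix otherwise (the helper name is just for illustration):

import numpy as np
import scipy.sparse as sp

def as_csc(dense):
    dense = np.asarray(dense)
    if 0 in dense.shape:
        # nothing to store sparsely anyway; keep the dense zero-size array
        return dense
    return sp.csc_matrix(dense)

print(as_csc(np.ones((0, 0))).shape)   # (0, 0), plain ndarray
print(as_csc(np.eye(3)).shape)         # (3, 3), sparse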
-Chris From ralf.gommers at googlemail.com Fri Feb 17 13:04:23 2012 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Fri, 17 Feb 2012 19:04:23 +0100 Subject: [SciPy-User] Trouble with numpy on OSX... In-Reply-To: References: Message-ID: On Fri, Feb 17, 2012 at 5:35 PM, Anthony Palomba wrote: > I am trying to get my scipy environment running on my mac. > I have a MBP running OSX 10.7 with python2.7 (python.org) > installed. > > I installed scipy-0.10.0-py2.7-python.org-macosx10.3 > and numpy-1.6.1-py2.7-python.org-macosx10.3. > > When I try to import multiarray, i get the following error... > > ImportError: > dlopen(/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/numpy/core/multiarray.so, > 2): no suitable image found. Did find: > /Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/numpy/core/multiarray.so: > no matching architecture in universal wrapper > > Is there something I am missing? > > Did you notice that there are two installers for Python 2.7 on OS X? One ends in "macosx10.3" and one in "macosx10.6". You probably picked the Python installer on python.org ending in 10.6, resulting in a 64-bit Python and 32-bit numpy/scipy. This would explain the error you're seeing. Cheers, Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From harpend at gmail.com Fri Feb 17 13:21:59 2012 From: harpend at gmail.com (Henry Harpending) Date: Fri, 17 Feb 2012 11:21:59 -0700 Subject: [SciPy-User] Trouble with bumpy on OSX Message-ID: Anthony Palomba wrote: > I am trying to get my scipy environment running on my mac. > I have a MBP running OSX 10.7 with python2.7 (python.org) > installed. > > I installed scipy-0.10.0-py2.7-python.org-macosx10.3 > and numpy-1.6.1-py2.7-python.org-macosx10.3. > > When I try to import multiarray, i get the following error... > > ImportError: > dlopen(/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/numpy/core/multiarray.so, > 2): no suitable image found. Did find: > /Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/numpy/core/multiarray.so: > no matching architecture in universal wrapper > > Is there something I am missing? I have no trouble using the Enthought Python Distribution on OS X: http://enthought.com/products/epd_free.php and they have a fuller pay version (free to academics) also. I am happy with just scipy and pylab in the free version. Henry Harpending -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Fri Feb 17 22:21:06 2012 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Fri, 17 Feb 2012 22:21:06 -0500 Subject: [SciPy-User] [OT] Bayesian vs. frequentist In-Reply-To: <4F3AC694.8080205@molden.no> References: <4F3AB8AA.2040402@molden.no> <4F3AC694.8080205@molden.no> Message-ID: On Tue, Feb 14, 2012 at 3:39 PM, Sturla Molden wrote: > On 14.02.2012 21:24, josef.pktd at gmail.com wrote: > >> Do you expect an argument? sounds a bit like http://andrewgelman.com/ > > No I don't. I am just tired of explaining that "significant p-value" > does not imply "very important effect". Sorry for spamming the list with > my rant. Since I just came across a paper on the topic http://www.anesthesia-analgesia.org/content/112/3/678.short while looking for something else. 
I have seen now several papers (mainly the abstracts) that test "equivalence" instead of "equality", so we can specify in advance what difference is "important" and what would be "equivalent". (not Bayesian) Josef > > Sturla > > > > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user From vanforeest at gmail.com Sat Feb 18 14:08:46 2012 From: vanforeest at gmail.com (nicky van foreest) Date: Sat, 18 Feb 2012 20:08:46 +0100 Subject: [SciPy-User] Fwd: In-Reply-To: References: Message-ID: Hi, Is there a general routine in numpy/scipy to find the left most root of a 1-d function? As an example: import numpy as np import pylab as pl grid = np.arange(0, 120., 0.1) f = 3 - 0.1*grid + np.sin(grid) #pl.plot(grid, f) #pl.show() # simplistic method to find the left root: # since I know that f(0) > 0: i = 0 while(f[i]>=0): ? ?i+= 1 print i, grid[i], f[i] This algorithm works, provided a suitable initial condition, but it not particularly efficient. I know that there are fast root solvers in scipy.optimize, but they require to provide an a and b such that f(a) < 0 < f(b) (or the other way around). In my general case, these bounds are not easy to obtain (and I can easily create more taxing problems for which I would like to have a general, robust and efficient method to find the left-most root). Thanks Nicky From gustavo.goretkin at gmail.com Sun Feb 19 11:52:25 2012 From: gustavo.goretkin at gmail.com (Gustavo Goretkin) Date: Sun, 19 Feb 2012 11:52:25 -0500 Subject: [SciPy-User] masked recarray, recarray with one field of type "ndarray" In-Reply-To: References: Message-ID: In short: do recarrays support masking? On Wed, Feb 1, 2012 at 11:36 AM, Gustavo Goretkin wrote: > Thanks for the help! Now is there any way to mask elements of a > recarray? I should explain the application because I think I may be > going about this the wrong way: > I'll be building a tree and each node will have some attributes (for > example, a matrix). I often have to iterate through every node of the > tree and do a calculation -- something that I could do in a vectorized > way with NumPy if all the attributes were stored in an array. So I > thought I could represent the tree as a recarray (that I'd > occasionally need to grow). > > I'd also need to delete nodes from the tree occasionally. I'd > accomplish this by masking entries of the recarray. When I needed to > add a node to the tree, I'd try to populate a masked entry before > going to the end of the array. > > On Tue, Jan 31, 2012 at 9:33 AM, Warren Weckesser > wrote: >> >> >> On Tue, Jan 31, 2012 at 2:36 AM, Gustavo Goretkin >> wrote: >>> >>> Does a recarray support masking? >>> >>> Can I have a recarray where one of the fields is an M-by-N ndarray >>> (not recarray) of some dtype? >>> ex: a = np.recarray(shape=(10),formats=['i4','f8','3-by-3 ndarray of >>> dtype=float64']) >> >> >> >> Here's how it can be done with the dtype argument (in this case, the >> "sub-arrays" are 3x5 float32): >> >> In [21]: dt = np.dtype([('id', int32), ('values', float32, (3,5))]) >> >> In [22]: a = np.recarray(shape=(3,), dtype=dt) >> >> In [23]: a.id >> Out[23]: array([????? 7, 2345536, 8585218]) >> >> In [24]: a[0].id >> Out[24]: 7 >> >> In [25]: a[0].values >> Out[25]: >> array([[? 9.80908925e-45,?? 2.15997513e-37,?? 3.16079124e-39, >> ????????? 1.18408375e-38,?? 2.81552923e-38], >> ?????? [? 2.13004362e-37,? -7.69011974e-02,?? 9.80908925e-45, >> ????????? 9.80908925e-45,?? 
3.62636667e-21], >> ?????? [? 5.67059093e-24,?? 5.67095065e-24,?? 5.64768872e-24, >> ????????? 7.86448908e+11,?? 0.00000000e+00]], dtype=float32) >> >> In [26]: a[0].values.shape >> Out[26]: (3, 5) >> >> >> Warren >> >> >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user >> From ralf.gommers at googlemail.com Mon Feb 20 02:51:50 2012 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Mon, 20 Feb 2012 08:51:50 +0100 Subject: [SciPy-User] ANN: SciPy 0.10.1 release candidate 2 Message-ID: Hi all, I am pleased to announce the availability of the second release candidate of SciPy 0.10.1. Please try out this release and report any problems on the scipy-dev mailing list. If no new problems are found, the final release will be available in one week. Sources and binaries can be found at http://sourceforge.net/projects/scipy/files/scipy/0.10.1rc2/, release notes are copied below. Things that were fixed between RC1 and RC2: - a compile error with MSVC9 against NumPy master - include missing Bento files in source release - some Python 3.x test warnings - an issue with misc.imread when PIL is not installed - Arpack single-precision test failures - cleaned up DeprecationWarnings when Umfpack is not installed Cheers, Ralf ========================== SciPy 0.10.1 Release Notes ========================== .. contents:: SciPy 0.10.1 is a bug-fix release with no new features compared to 0.10.0. Main changes ------------ The most important changes are:: 1. The single precision routines of ``eigs`` and ``eigsh`` in ``scipy.sparse.linalg`` have been disabled (they internally use double precision now). 2. A compatibility issue related to changes in NumPy macros has been fixed, in order to make scipy 0.10.1 compile with the upcoming numpy 1.7.0 release. Other issues fixed ------------------ - #835: stats: nan propagation in stats.distributions - #1202: io: netcdf segfault - #1531: optimize: make curve_fit work with method as callable. - #1560: linalg: fixed mistake in eig_banded documentation. - #1565: ndimage: bug in ndimage.variance - #1457: ndimage: standard_deviation does not work with sequence of indexes - #1562: cluster: segfault in linkage function - #1568: stats: One-sided fisher_exact() returns `p` < 1 for 0 successful attempts - #1575: stats: zscore and zmap handle the axis keyword incorrectly Checksums ========= b119828c64a68794c9562f8228dd7cf9 release/installers/scipy-0.10.1rc2-py2.7-python.org-macosx10.6.dmg 605a30b8a33ff6763261ffde59a38bb9 release/installers/scipy-0.10.1rc2-win32-superpack-python2.5.exe 5a056ed6dbb9abd10bd824a64c47c159 release/installers/scipy-0.10.1rc2-win32-superpack-python2.6.exe 0498bb3f48d0cb251cb9e527b454a50b release/installers/scipy-0.10.1rc2-win32-superpack-python2.7.exe f078ebc55d1b7d832474c8379852062c release/installers/scipy-0.10.1rc2-win32-superpack-python3.1.exe 3ee82aebb2c1d0425fb85c72c6eea80e release/installers/scipy-0.10.1rc2-win32-superpack-python3.2.exe 540ec78cb451bfebbd8f84193fe76581 release/installers/scipy-0.10.1rc2.tar.gz 4813d52623ae63ed492e28bdcecee4c0 release/installers/scipy-0.10.1rc2.zip -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From paul.anton.letnes at gmail.com Mon Feb 20 04:02:10 2012 From: paul.anton.letnes at gmail.com (Paul Anton Letnes) Date: Mon, 20 Feb 2012 10:02:10 +0100 Subject: [SciPy-User] How to use module parameters in dimension() In-Reply-To: <1595876.WsfY7zjZfd@mu.site> References: <1595876.WsfY7zjZfd@mu.site> Message-ID: Hi, I was just thinking that you could probably avoid the whole issue by using assumed-shape arrays. This actually gives you more checks and balances when using e.g. bounds checking with your fortran compiler (gfortran: -fbounds-check). Explicit size does not, afaik. Paul module tests implicit none contains subroutine calc(t,z) implicit none real, intent(in):: t real, dimension(:), intent(out):: z integer :: n n = size(z) z = 2*n + t end subroutine calc end module tests On 16. feb. 2012, at 18:49, Jose Amoreira wrote: > Hello > I have a module that defines the dimension of an array as a parameter, and a subroutine in that module that computes and returns that array dimension: > > 1 module tests > 2 implicit none > 3 integer, parameter:: n=3 !Dimension of arrays > 4 > 5 contains > 6 > 7 subroutine calc(t,z) > 8 real, intent(in):: t > 9 real, dimension(n), intent(out):: z > 10 z= 2*n + t > 11 end subroutine calc > 12 end module tests > > This simple example works fine in fortran. But when I try to turn it into a python module with f2py, the process fails with exit status 1, stating that: > "In function ?f2py_rout_tests_tests_calc?: > /tmp/tmpZZiuce/src.linux-x86_64-2.7/testsmodule.c:230:14: error: ?n? undeclared (first use in this function)" > > I find it strange that this failure only occurs if array z is a dummy argument of the calc subroutine. Otherwise, as in the listing below, f2py doesn't complain. > > 1 module tests > 2 implicit none > 3 integer, parameter:: n=3 > 4 > 5 contains > 6 > 7 subroutine calc(t,x) > 8 real,intent(in):: t > 9 real,intent(out):: x > 10 real,dimension(n):: z > 11 z=0. > 12 x= 2*n + t > 13 end subroutine calc > 14 end module tests > > So, my problem: is there any way to fix this? I mean, is it possible for f2py to compile a fortran module containing subroutines with parametrized dimension dummy arguments? Or am I missing some trivial tweak here? > > Thanks, > Jose Amoreira > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user From ljmamoreira at gmail.com Mon Feb 20 05:09:26 2012 From: ljmamoreira at gmail.com (Jose Amoreira) Date: Mon, 20 Feb 2012 10:09:26 +0000 Subject: [SciPy-User] How to use module parameters in dimension() In-Reply-To: References: <1595876.WsfY7zjZfd@mu.site> Message-ID: <5196073.PrtyTcbHl1@mu.site> Thank you very much, Paul. You were a big help. I started rewriting my subroutines using arrays bounds explicitly specified in the argument lists. Your suggestion is a lot better. (Still, it's troubling that array size parameters of subroutine dummy arguments can not be shared across a module, as other data can.) Thanks again Jose Amoreira On Monday, February 20, 2012 10:02:10 AM Paul Anton Letnes wrote: > Hi, > > I was just thinking that you could probably avoid the whole issue by using > assumed-shape arrays. This actually gives you more checks and balances when > using e.g. bounds checking with your fortran compiler (gfortran: > -fbounds-check). Explicit size does not, afaik. 
> > Paul > > module tests > implicit none > > contains > > subroutine calc(t,z) > implicit none > real, intent(in):: t > real, dimension(:), intent(out):: z > integer :: n > n = size(z) > z = 2*n + t > end subroutine calc > end module tests > > On 16. feb. 2012, at 18:49, Jose Amoreira wrote: > > Hello > > > > I have a module that defines the dimension of an array as a parameter, and a subroutine in that module that computes and returns that array dimension: > > 1 module tests > > 2 implicit none > > 3 integer, parameter:: n=3 !Dimension of arrays > > 4 > > 5 contains > > 6 > > 7 subroutine calc(t,z) > > 8 real, intent(in):: t > > 9 real, dimension(n), intent(out):: z > > > > 10 z= 2*n + t > > 11 end subroutine calc > > 12 end module tests > > > > This simple example works fine in fortran. But when I try to turn it > > into a python module with f2py, the process fails with exit status 1, > > stating that: "In function ?f2py_rout_tests_tests_calc?: > > /tmp/tmpZZiuce/src.linux-x86_64-2.7/testsmodule.c:230:14: error: ?n? > > undeclared (first use in this function)" > > > > I find it strange that this failure only occurs if array z is a dummy > > argument of the calc subroutine. Otherwise, as in the listing below, > > f2py doesn't complain.> > > 1 module tests > > 2 implicit none > > 3 integer, parameter:: n=3 > > 4 > > 5 contains > > 6 > > 7 subroutine calc(t,x) > > 8 real,intent(in):: t > > 9 real,intent(out):: x > > > > 10 real,dimension(n):: z > > 11 z=0. > > 12 x= 2*n + t > > 13 end subroutine calc > > 14 end module tests > > > > So, my problem: is there any way to fix this? I mean, is it possible for > > f2py to compile a fortran module containing subroutines with > > parametrized dimension dummy arguments? Or am I missing some trivial > > tweak here? > > > > Thanks, > > Jose Amoreira > > _______________________________________________ > > SciPy-User mailing list > > SciPy-User at scipy.org > > http://mail.scipy.org/mailman/listinfo/scipy-user -------------- next part -------------- An HTML attachment was scrubbed... URL: From rishi.ou at gmail.com Mon Feb 20 19:00:57 2012 From: rishi.ou at gmail.com (RDX) Date: Mon, 20 Feb 2012 16:00:57 -0800 (PST) Subject: [SciPy-User] [SciPy-user] Error fmin Message-ID: <33359168.post@talk.nabble.com> I am using fmin and it returns initial guess as the final solution. def Traug (v): err=0 Traug_RHOB = v[0] + v[1]*(z_den_s/3125.0)**v[2] count =arange(size(den_s)) for i in count: err += abs(den_s[i] - Traug_RHOB [i]) return err vO=[1.6, 0.5, 0.35] t=fmin(Traug,vO ) a=t[0]; b=t[1]; c=t[2]; Code gives me a=1.6, b=0.5, c=0.35, which is my initial guess. What is wrong? -- View this message in context: http://old.nabble.com/Error-fmin-tp33359168p33359168.html Sent from the Scipy-User mailing list archive at Nabble.com. From servant.mathieu at gmail.com Tue Feb 21 04:38:45 2012 From: servant.mathieu at gmail.com (servant mathieu) Date: Tue, 21 Feb 2012 10:38:45 +0100 Subject: [SciPy-User] simplex algorithm and curve fitting Message-ID: Dear Scipy users, I've got some troubles with the scipy.optimize.curve_fit function. This function is based on the Levenburg-Maquardt algorithm, which is extremely rapid but usually finds a local minimum, not a global one (and thus often returns anormal parameter values). In my research field, we usually use Nelder Mead's simplex routines to avoid this problem. However, I don't know if it is possible to perform curve fitting in scipy using simplex; the fmin function doesn't seem to perform adjustments to data. 
Here is my code for fitting a three parameters hyperbolic cotangent function using curve_fit: from scipy.optimize import curve_fit import numpy as np def func (x, A,k, r ): return r + (A /(k*x)) * np.tanh (A*k*x) xdata = np.array ([0.15,0.25,0.35,0.45, 0.55, 0.75]) datacomp = np.array ([344.3276300, 324.0051063, 314.2693475, 309.9906375,309.9251162, 307.3955800]) dataincomp = np.array ([363.3839888, 343.5735787, 334.6013375, 327.7868238, 329.4642550, 328.0667050]) poptcomp, pcovcomp = curve_fit (func, xdata, datacomp, maxfev = 10000) poptincomp, pcovincomp = curve_fit (func, xdata, dataincomp, maxfev = 10000) How could I proceed to perform the fitting using simplex? Best, Mat -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Tue Feb 21 05:05:58 2012 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Tue, 21 Feb 2012 05:05:58 -0500 Subject: [SciPy-User] simplex algorithm and curve fitting In-Reply-To: References: Message-ID: On Tue, Feb 21, 2012 at 4:38 AM, servant mathieu wrote: > Dear Scipy users, > > I've got some troubles with the?scipy.optimize.curve_fit function. This > function is based on the Levenburg-Maquardt algorithm, which is extremely > rapid but usually finds a local minimum, not a global one (and thus often > returns anormal parameter values). In my research field, we usually use > Nelder Mead's simplex routines to avoid this problem. However, I don't know > if it is possible to perform curve fitting in scipy using simplex; the fmin > function? doesn't seem to perform adjustments to data. > > Here is my code for fitting a three parameters hyperbolic cotangent function > using curve_fit: > > from scipy.optimize import curve_fit > > import numpy as np > > > > def func (x, A,k, r ): > > return r + (A /(k*x)) * np.tanh (A*k*x) > > > > xdata = np.array ([0.15,0.25,0.35,0.45, 0.55, 0.75]) > > > > datacomp = np.array ([344.3276300, 324.0051063, 314.2693475, > 309.9906375,309.9251162, 307.3955800]) > > dataincomp = np.array ([363.3839888, 343.5735787, 334.6013375, 327.7868238, > 329.4642550, 328.0667050]) > > > > poptcomp, pcovcomp = curve_fit (func, xdata, datacomp, maxfev = 10000) > > poptincomp, pcovincomp = curve_fit (func, xdata, dataincomp, maxfev = 10000) > > > > > > How could I?proceed to perform the fitting using simplex? You need to define your own loss function for the optimizers like fmin, or in future version minimize http://docs.scipy.org/doc/scipy-0.10.0/reference/tutorial/optimize.html#nelder-mead-simplex-algorithm-fmin something like this def loss(params, args) A, k, r = params y, x = args return ((y - func (x, A,k, r ))**2).sum() and use loss in the call to fmin Josef > > > > Best, > > Mat > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From servant.mathieu at gmail.com Tue Feb 21 05:19:49 2012 From: servant.mathieu at gmail.com (servant mathieu) Date: Tue, 21 Feb 2012 11:19:49 +0100 Subject: [SciPy-User] simplex algorithm and curve fitting In-Reply-To: References: Message-ID: Thanks a lot Joseph. I've got a last question which concerns initials guess for parameters. In fact, in many papers, i read " we fitted both functions using standard simplex optimization routines. This was repeated 10000 times with randomized initial values to avoid local minima." The problemis the following: which range of values should we use for this randomization? 
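A self-contained sketch that puts the loss-plus-fmin suggestion above together with randomized restarts; the parameter ranges below are only rough guesses based on the scale of the data, not recommended values:

import numpy as np
from scipy.optimize import fmin

def func(x, A, k, r):
    return r + (A / (k * x)) * np.tanh(A * k * x)

def loss(params, x, y):
    A, k, r = params
    return ((y - func(x, A, k, r)) ** 2).sum()

xdata = np.array([0.15, 0.25, 0.35, 0.45, 0.55, 0.75])
datacomp = np.array([344.3276300, 324.0051063, 314.2693475,
                     309.9906375, 309.9251162, 307.3955800])

rng = np.random.RandomState(0)
best_val, best_params = np.inf, None
for _ in range(1000):
    p0 = [rng.uniform(0.1, 20.0),      # A -- guessed range
          rng.uniform(0.01, 5.0),      # k -- guessed range
          rng.uniform(250.0, 350.0)]   # r -- roughly the scale of the data
    popt = fmin(loss, p0, args=(xdata, datacomp), disp=False)
    val = loss(popt, xdata, datacomp)
    if val < best_val:
        best_val, best_params = val, popt

print(best_params, best_val)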
Best, Mat 2012/2/21 > On Tue, Feb 21, 2012 at 4:38 AM, servant mathieu > wrote: > > Dear Scipy users, > > > > I've got some troubles with the scipy.optimize.curve_fit function. This > > function is based on the Levenburg-Maquardt algorithm, which is extremely > > rapid but usually finds a local minimum, not a global one (and thus often > > returns anormal parameter values). In my research field, we usually use > > Nelder Mead's simplex routines to avoid this problem. However, I don't > know > > if it is possible to perform curve fitting in scipy using simplex; the > fmin > > function doesn't seem to perform adjustments to data. > > > > Here is my code for fitting a three parameters hyperbolic cotangent > function > > using curve_fit: > > > > from scipy.optimize import curve_fit > > > > import numpy as np > > > > > > > > def func (x, A,k, r ): > > > > return r + (A /(k*x)) * np.tanh (A*k*x) > > > > > > > > xdata = np.array ([0.15,0.25,0.35,0.45, 0.55, 0.75]) > > > > > > > > datacomp = np.array ([344.3276300, 324.0051063, 314.2693475, > > 309.9906375,309.9251162, 307.3955800]) > > > > dataincomp = np.array ([363.3839888, 343.5735787, 334.6013375, > 327.7868238, > > 329.4642550, 328.0667050]) > > > > > > > > poptcomp, pcovcomp = curve_fit (func, xdata, datacomp, maxfev = 10000) > > > > poptincomp, pcovincomp = curve_fit (func, xdata, dataincomp, maxfev = > 10000) > > > > > > > > > > > > How could I proceed to perform the fitting using simplex? > > You need to define your own loss function for the optimizers like > fmin, or in future version minimize > > http://docs.scipy.org/doc/scipy-0.10.0/reference/tutorial/optimize.html#nelder-mead-simplex-algorithm-fmin > > something like this > > def loss(params, args) > A, k, r = params > y, x = args > return ((y - func (x, A,k, r ))**2).sum() > > and use loss in the call to fmin > > Josef > > > > > > > > > > Best, > > > > Mat > > > > > > _______________________________________________ > > SciPy-User mailing list > > SciPy-User at scipy.org > > http://mail.scipy.org/mailman/listinfo/scipy-user > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Tue Feb 21 05:47:49 2012 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Tue, 21 Feb 2012 05:47:49 -0500 Subject: [SciPy-User] simplex algorithm and curve fitting In-Reply-To: References: Message-ID: On Tue, Feb 21, 2012 at 5:19 AM, servant mathieu wrote: > Thanks a lot Joseph. I've got a last question which concerns initials guess > for parameters. In fact, in many papers, i read " we fitted both functions > using standard simplex optimization routines. This was repeated 10000 times > with randomized initial values to avoid local minima."? The problemis the > following: which range of values should we use for this randomization? I wanted to mention this before, all fmin are local optimizers, anneal is the only global optimizer in scipy, but a bit tricky to use. There are other global optimizers written in python that might work better, but I never tried any of these packages. Choosing (random) starting values depends completely on the function and there is no function independent recipe, since the parameterization of a function is pretty arbitrary. So, you need an "educated" guess over the possible range given the specific function and problem. 
For specific classes of functions in not too high dimension it would be possible to find (and code) starting values, or ranges for a random search. I haven't tried out what your function looks like, but I would guess that there are at least some sign restrictions. I usually try to see if I can guess starting values, and ranges for randomization, based on min, max and mean of the observations. Josef > > Best, > Mat > > > 2012/2/21 > >> On Tue, Feb 21, 2012 at 4:38 AM, servant mathieu >> wrote: >> > Dear Scipy users, >> > >> > I've got some troubles with the?scipy.optimize.curve_fit function. This >> > function is based on the Levenburg-Maquardt algorithm, which is >> > extremely >> > rapid but usually finds a local minimum, not a global one (and thus >> > often >> > returns anormal parameter values). In my research field, we usually use >> > Nelder Mead's simplex routines to avoid this problem. However, I don't >> > know >> > if it is possible to perform curve fitting in scipy using simplex; the >> > fmin >> > function? doesn't seem to perform adjustments to data. >> > >> > Here is my code for fitting a three parameters hyperbolic cotangent >> > function >> > using curve_fit: >> > >> > from scipy.optimize import curve_fit >> > >> > import numpy as np >> > >> > >> > >> > def func (x, A,k, r ): >> > >> > return r + (A /(k*x)) * np.tanh (A*k*x) >> > >> > >> > >> > xdata = np.array ([0.15,0.25,0.35,0.45, 0.55, 0.75]) >> > >> > >> > >> > datacomp = np.array ([344.3276300, 324.0051063, 314.2693475, >> > 309.9906375,309.9251162, 307.3955800]) >> > >> > dataincomp = np.array ([363.3839888, 343.5735787, 334.6013375, >> > 327.7868238, >> > 329.4642550, 328.0667050]) >> > >> > >> > >> > poptcomp, pcovcomp = curve_fit (func, xdata, datacomp, maxfev = 10000) >> > >> > poptincomp, pcovincomp = curve_fit (func, xdata, dataincomp, maxfev = >> > 10000) >> > >> > >> > >> > >> > >> > How could I?proceed to perform the fitting using simplex? >> >> You need to define your own loss function for the optimizers like >> fmin, or in future version minimize >> >> http://docs.scipy.org/doc/scipy-0.10.0/reference/tutorial/optimize.html#nelder-mead-simplex-algorithm-fmin >> >> something like this >> >> def loss(params, args) >> ? ? A, k, r = params >> ? ? y, x = args >> ? ? return ((y - func (x, A,k, r ))**2).sum() >> >> and use loss in the call to fmin >> >> Josef >> > > > > > >> >> > >> > >> > >> > Best, >> > >> > Mat >> > >> > >> > _______________________________________________ >> > SciPy-User mailing list >> > SciPy-User at scipy.org >> > http://mail.scipy.org/mailman/listinfo/scipy-user >> > >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user > > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From josef.pktd at gmail.com Tue Feb 21 06:29:04 2012 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Tue, 21 Feb 2012 06:29:04 -0500 Subject: [SciPy-User] simplex algorithm and curve fitting In-Reply-To: References: Message-ID: On Tue, Feb 21, 2012 at 5:47 AM, wrote: > On Tue, Feb 21, 2012 at 5:19 AM, servant mathieu > wrote: >> Thanks a lot Joseph. I've got a last question which concerns initials guess >> for parameters. In fact, in many papers, i read " we fitted both functions >> using standard simplex optimization routines. 
This was repeated 10000 times >> with randomized initial values to avoid local minima."? The problemis the >> following: which range of values should we use for this randomization? > > I wanted to mention this before, all fmin are local optimizers, anneal > is the only global optimizer in scipy, but a bit tricky to use. There > are other global optimizers written in python that might work better, > but I never tried any of these packages. > > Choosing (random) starting values depends completely on the function > and there is no function independent recipe, since the > parameterization of a function is pretty arbitrary. So, you need an > "educated" guess over the possible range given the specific function > and problem. > > For specific classes of functions in not too high dimension it would > be possible to find (and code) starting values, or ranges for a random > search. > > I haven't tried out what your function looks like, but I would guess > that there are at least some sign restrictions. I usually try to see > if I can guess starting values, and ranges for randomization, based on > min, max and mean of the observations. (can you please bottom post in this mailing list, it's difficult to find the thread) tanh looks bad for optimization, it's essentially flat, -1 or 1 outside of (-4,4) or so. playing a bit, fmin and curve_fit just find solutions with the tanh part equal to 1 or -1, fmin finds -1, curve_fit picks +1 (with the same starting values) If you want any action from the tanh part, then, it looks to me, A*k*x would need to be restricted to be mostly in the (-4,4) range, maybe a reparameterization (with b = A*k) would help. Josef > > Josef > >> >> Best, >> Mat >> >> >> 2012/2/21 >> >>> On Tue, Feb 21, 2012 at 4:38 AM, servant mathieu >>> wrote: >>> > Dear Scipy users, >>> > >>> > I've got some troubles with the?scipy.optimize.curve_fit function. This >>> > function is based on the Levenburg-Maquardt algorithm, which is >>> > extremely >>> > rapid but usually finds a local minimum, not a global one (and thus >>> > often >>> > returns anormal parameter values). In my research field, we usually use >>> > Nelder Mead's simplex routines to avoid this problem. However, I don't >>> > know >>> > if it is possible to perform curve fitting in scipy using simplex; the >>> > fmin >>> > function? doesn't seem to perform adjustments to data. >>> > >>> > Here is my code for fitting a three parameters hyperbolic cotangent >>> > function >>> > using curve_fit: >>> > >>> > from scipy.optimize import curve_fit >>> > >>> > import numpy as np >>> > >>> > >>> > >>> > def func (x, A,k, r ): >>> > >>> > return r + (A /(k*x)) * np.tanh (A*k*x) >>> > >>> > >>> > >>> > xdata = np.array ([0.15,0.25,0.35,0.45, 0.55, 0.75]) >>> > >>> > >>> > >>> > datacomp = np.array ([344.3276300, 324.0051063, 314.2693475, >>> > 309.9906375,309.9251162, 307.3955800]) >>> > >>> > dataincomp = np.array ([363.3839888, 343.5735787, 334.6013375, >>> > 327.7868238, >>> > 329.4642550, 328.0667050]) >>> > >>> > >>> > >>> > poptcomp, pcovcomp = curve_fit (func, xdata, datacomp, maxfev = 10000) >>> > >>> > poptincomp, pcovincomp = curve_fit (func, xdata, dataincomp, maxfev = >>> > 10000) >>> > >>> > >>> > >>> > >>> > >>> > How could I?proceed to perform the fitting using simplex? 
>>> >>> You need to define your own loss function for the optimizers like >>> fmin, or in future version minimize >>> >>> http://docs.scipy.org/doc/scipy-0.10.0/reference/tutorial/optimize.html#nelder-mead-simplex-algorithm-fmin >>> >>> something like this >>> >>> def loss(params, args) >>> ? ? A, k, r = params >>> ? ? y, x = args >>> ? ? return ((y - func (x, A,k, r ))**2).sum() >>> >>> and use loss in the call to fmin >>> >>> Josef >>> >> >> >> >> >> >>> >>> > >>> > >>> > >>> > Best, >>> > >>> > Mat >>> > >>> > >>> > _______________________________________________ >>> > SciPy-User mailing list >>> > SciPy-User at scipy.org >>> > http://mail.scipy.org/mailman/listinfo/scipy-user >>> > >>> _______________________________________________ >>> SciPy-User mailing list >>> SciPy-User at scipy.org >>> http://mail.scipy.org/mailman/listinfo/scipy-user >> >> >> >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user >> From servant.mathieu at gmail.com Tue Feb 21 06:30:03 2012 From: servant.mathieu at gmail.com (servant mathieu) Date: Tue, 21 Feb 2012 12:30:03 +0100 Subject: [SciPy-User] simplex algorithm and curve fitting In-Reply-To: References: Message-ID: Ok, so as far as I understand, curve fitting is not possible in scipy.. Many researchers use the fminsearch function in matlab based on simplex standard routines. Is it a global optimizer? Mat 2012/2/21 > On Tue, Feb 21, 2012 at 5:19 AM, servant mathieu > wrote: > > Thanks a lot Joseph. I've got a last question which concerns initials > guess > > for parameters. In fact, in many papers, i read " we fitted both > functions > > using standard simplex optimization routines. This was repeated 10000 > times > > with randomized initial values to avoid local minima." The problemis the > > following: which range of values should we use for this randomization? > > I wanted to mention this before, all fmin are local optimizers, anneal > is the only global optimizer in scipy, but a bit tricky to use. There > are other global optimizers written in python that might work better, > but I never tried any of these packages. > > Choosing (random) starting values depends completely on the function > and there is no function independent recipe, since the > parameterization of a function is pretty arbitrary. So, you need an > "educated" guess over the possible range given the specific function > and problem. > > For specific classes of functions in not too high dimension it would > be possible to find (and code) starting values, or ranges for a random > search. > > I haven't tried out what your function looks like, but I would guess > that there are at least some sign restrictions. I usually try to see > if I can guess starting values, and ranges for randomization, based on > min, max and mean of the observations. > > Josef > > > > > Best, > > Mat > > > > > > 2012/2/21 > > > >> On Tue, Feb 21, 2012 at 4:38 AM, servant mathieu > >> wrote: > >> > Dear Scipy users, > >> > > >> > I've got some troubles with the scipy.optimize.curve_fit function. > This > >> > function is based on the Levenburg-Maquardt algorithm, which is > >> > extremely > >> > rapid but usually finds a local minimum, not a global one (and thus > >> > often > >> > returns anormal parameter values). In my research field, we usually > use > >> > Nelder Mead's simplex routines to avoid this problem. 
However, I don't > >> > know > >> > if it is possible to perform curve fitting in scipy using simplex; the > >> > fmin > >> > function doesn't seem to perform adjustments to data. > >> > > >> > Here is my code for fitting a three parameters hyperbolic cotangent > >> > function > >> > using curve_fit: > >> > > >> > from scipy.optimize import curve_fit > >> > > >> > import numpy as np > >> > > >> > > >> > > >> > def func (x, A,k, r ): > >> > > >> > return r + (A /(k*x)) * np.tanh (A*k*x) > >> > > >> > > >> > > >> > xdata = np.array ([0.15,0.25,0.35,0.45, 0.55, 0.75]) > >> > > >> > > >> > > >> > datacomp = np.array ([344.3276300, 324.0051063, 314.2693475, > >> > 309.9906375,309.9251162, 307.3955800]) > >> > > >> > dataincomp = np.array ([363.3839888, 343.5735787, 334.6013375, > >> > 327.7868238, > >> > 329.4642550, 328.0667050]) > >> > > >> > > >> > > >> > poptcomp, pcovcomp = curve_fit (func, xdata, datacomp, maxfev = 10000) > >> > > >> > poptincomp, pcovincomp = curve_fit (func, xdata, dataincomp, maxfev = > >> > 10000) > >> > > >> > > >> > > >> > > >> > > >> > How could I proceed to perform the fitting using simplex? > >> > >> You need to define your own loss function for the optimizers like > >> fmin, or in future version minimize > >> > >> > http://docs.scipy.org/doc/scipy-0.10.0/reference/tutorial/optimize.html#nelder-mead-simplex-algorithm-fmin > >> > >> something like this > >> > >> def loss(params, args) > >> A, k, r = params > >> y, x = args > >> return ((y - func (x, A,k, r ))**2).sum() > >> > >> and use loss in the call to fmin > >> > >> Josef > >> > > > > > > > > > > > >> > >> > > >> > > >> > > >> > Best, > >> > > >> > Mat > >> > > >> > > >> > _______________________________________________ > >> > SciPy-User mailing list > >> > SciPy-User at scipy.org > >> > http://mail.scipy.org/mailman/listinfo/scipy-user > >> > > >> _______________________________________________ > >> SciPy-User mailing list > >> SciPy-User at scipy.org > >> http://mail.scipy.org/mailman/listinfo/scipy-user > > > > > > > > _______________________________________________ > > SciPy-User mailing list > > SciPy-User at scipy.org > > http://mail.scipy.org/mailman/listinfo/scipy-user > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Tue Feb 21 07:07:33 2012 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Tue, 21 Feb 2012 07:07:33 -0500 Subject: [SciPy-User] simplex algorithm and curve fitting In-Reply-To: References: Message-ID: On Tue, Feb 21, 2012 at 6:30 AM, servant mathieu wrote: > Ok, so?as far as I understand, curve fitting is not possible in scipy..?Many > researchers?use the fminsearch function in matlab based on simplex standard > routines.? Is it a global optimizer? fminsearch in matlab uses 'Nelder-Mead simplex direct search' which is the same algorithm as scipy.optimize.fmin My impression is that your function does not have a solution with np.abs(tanh(A*k*x)) not equal to one, I didn't find one that would come close to the solution when the tanh part is one. curve_fitting is possible in scipy and works very well in many cases, and it finds a solution based on the A/(k*x) part, but it cannot do something impossible, but I didn't try 10000 starting values. 
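Along the reparameterization idea mentioned earlier (b = A*k), a sketch with curve_fit and an equivalent form of the same model, writing c = A**2 so that A/(k*x) = c/(b*x); the starting values are guesses only:

import numpy as np
from scipy.optimize import curve_fit

def func2(x, c, b, r):
    # same model as before, r + (A/(k*x)) * tanh(A*k*x), with c = A**2 and b = A*k
    return r + (c / (b * x)) * np.tanh(b * x)

xdata = np.array([0.15, 0.25, 0.35, 0.45, 0.55, 0.75])
datacomp = np.array([344.3276300, 324.0051063, 314.2693475,
                     309.9906375, 309.9251162, 307.3955800])

popt, pcov = curve_fit(func2, xdata, datacomp, p0=[10.0, 1.0, 300.0], maxfev=10000)
print(popt)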
In general, if it is a quadratic problem, then leastsq works very well, better than fmin or other fmin_xxx for unconstrained problems. Josef > Mat > > 2012/2/21 >> >> On Tue, Feb 21, 2012 at 5:19 AM, servant mathieu >> wrote: >> > Thanks a lot Joseph. I've got a last question which concerns initials >> > guess >> > for parameters. In fact, in many papers, i read " we fitted both >> > functions >> > using standard simplex optimization routines. This was repeated 10000 >> > times >> > with randomized initial values to avoid local minima."? The problemis >> > the >> > following: which range of values should we use for this randomization? >> >> I wanted to mention this before, all fmin are local optimizers, anneal >> is the only global optimizer in scipy, but a bit tricky to use. There >> are other global optimizers written in python that might work better, >> but I never tried any of these packages. >> >> Choosing (random) starting values depends completely on the function >> and there is no function independent recipe, since the >> parameterization of a function is pretty arbitrary. So, you need an >> "educated" guess over the possible range given the specific function >> and problem. >> >> For specific classes of functions in not too high dimension it would >> be possible to find (and code) starting values, or ranges for a random >> search. >> >> I haven't tried out what your function looks like, but I would guess >> that there are at least some sign restrictions. I usually try to see >> if I can guess starting values, and ranges for randomization, based on >> min, max and mean of the observations. >> >> Josef >> >> > >> > Best, >> > Mat >> > >> > >> > 2012/2/21 >> > >> >> On Tue, Feb 21, 2012 at 4:38 AM, servant mathieu >> >> wrote: >> >> > Dear Scipy users, >> >> > >> >> > I've got some troubles with the?scipy.optimize.curve_fit function. >> >> > This >> >> > function is based on the Levenburg-Maquardt algorithm, which is >> >> > extremely >> >> > rapid but usually finds a local minimum, not a global one (and thus >> >> > often >> >> > returns anormal parameter values). In my research field, we usually >> >> > use >> >> > Nelder Mead's simplex routines to avoid this problem. However, I >> >> > don't >> >> > know >> >> > if it is possible to perform curve fitting in scipy using simplex; >> >> > the >> >> > fmin >> >> > function? doesn't seem to perform adjustments to data. >> >> > >> >> > Here is my code for fitting a three parameters hyperbolic cotangent >> >> > function >> >> > using curve_fit: >> >> > >> >> > from scipy.optimize import curve_fit >> >> > >> >> > import numpy as np >> >> > >> >> > >> >> > >> >> > def func (x, A,k, r ): >> >> > >> >> > return r + (A /(k*x)) * np.tanh (A*k*x) >> >> > >> >> > >> >> > >> >> > xdata = np.array ([0.15,0.25,0.35,0.45, 0.55, 0.75]) >> >> > >> >> > >> >> > >> >> > datacomp = np.array ([344.3276300, 324.0051063, 314.2693475, >> >> > 309.9906375,309.9251162, 307.3955800]) >> >> > >> >> > dataincomp = np.array ([363.3839888, 343.5735787, 334.6013375, >> >> > 327.7868238, >> >> > 329.4642550, 328.0667050]) >> >> > >> >> > >> >> > >> >> > poptcomp, pcovcomp = curve_fit (func, xdata, datacomp, maxfev = >> >> > 10000) >> >> > >> >> > poptincomp, pcovincomp = curve_fit (func, xdata, dataincomp, maxfev = >> >> > 10000) >> >> > >> >> > >> >> > >> >> > >> >> > >> >> > How could I?proceed to perform the fitting using simplex? 
>> >> >> >> You need to define your own loss function for the optimizers like >> >> fmin, or in future version minimize >> >> >> >> >> >> http://docs.scipy.org/doc/scipy-0.10.0/reference/tutorial/optimize.html#nelder-mead-simplex-algorithm-fmin >> >> >> >> something like this >> >> >> >> def loss(params, args) >> >> ? ? A, k, r = params >> >> ? ? y, x = args >> >> ? ? return ((y - func (x, A,k, r ))**2).sum() >> >> >> >> and use loss in the call to fmin >> >> >> >> Josef >> >> >> > >> > >> > >> > >> > >> >> >> >> > >> >> > >> >> > >> >> > Best, >> >> > >> >> > Mat >> >> > >> >> > >> >> > _______________________________________________ >> >> > SciPy-User mailing list >> >> > SciPy-User at scipy.org >> >> > http://mail.scipy.org/mailman/listinfo/scipy-user >> >> > >> >> _______________________________________________ >> >> SciPy-User mailing list >> >> SciPy-User at scipy.org >> >> http://mail.scipy.org/mailman/listinfo/scipy-user >> > >> > >> > >> > _______________________________________________ >> > SciPy-User mailing list >> > SciPy-User at scipy.org >> > http://mail.scipy.org/mailman/listinfo/scipy-user >> > >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user > > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From servant.mathieu at gmail.com Tue Feb 21 07:22:58 2012 From: servant.mathieu at gmail.com (servant mathieu) Date: Tue, 21 Feb 2012 13:22:58 +0100 Subject: [SciPy-User] simplex algorithm and curve fitting In-Reply-To: References: Message-ID: Thanks for all Joseph, I perfectly understand the problem. Please correct me if I'm wrong, but if it finally appears that curve_fit is an excellent optimization solver, provided that you put relevant initial values for parameters... Best, Mat 2012/2/21 > On Tue, Feb 21, 2012 at 6:30 AM, servant mathieu > wrote: > > Ok, so as far as I understand, curve fitting is not possible in > scipy.. Many > > researchers use the fminsearch function in matlab based on simplex > standard > > routines. Is it a global optimizer? > > fminsearch in matlab uses 'Nelder-Mead simplex direct search' which is > the same algorithm as scipy.optimize.fmin > > My impression is that your function does not have a solution with > np.abs(tanh(A*k*x)) not equal to one, I didn't find one that would > come close to the solution when the tanh part is one. > > curve_fitting is possible in scipy and works very well in many cases, > and it finds a solution based on the A/(k*x) part, but it cannot do > something impossible, but I didn't try 10000 starting values. > > In general, if it is a quadratic problem, then leastsq works very > well, better than fmin or other fmin_xxx for unconstrained problems. > > Josef > > > Mat > > > > 2012/2/21 > >> > >> On Tue, Feb 21, 2012 at 5:19 AM, servant mathieu > >> wrote: > >> > Thanks a lot Joseph. I've got a last question which concerns initials > >> > guess > >> > for parameters. In fact, in many papers, i read " we fitted both > >> > functions > >> > using standard simplex optimization routines. This was repeated 10000 > >> > times > >> > with randomized initial values to avoid local minima." The problemis > >> > the > >> > following: which range of values should we use for this randomization? 
> >> > >> I wanted to mention this before, all fmin are local optimizers, anneal > >> is the only global optimizer in scipy, but a bit tricky to use. There > >> are other global optimizers written in python that might work better, > >> but I never tried any of these packages. > >> > >> Choosing (random) starting values depends completely on the function > >> and there is no function independent recipe, since the > >> parameterization of a function is pretty arbitrary. So, you need an > >> "educated" guess over the possible range given the specific function > >> and problem. > >> > >> For specific classes of functions in not too high dimension it would > >> be possible to find (and code) starting values, or ranges for a random > >> search. > >> > >> I haven't tried out what your function looks like, but I would guess > >> that there are at least some sign restrictions. I usually try to see > >> if I can guess starting values, and ranges for randomization, based on > >> min, max and mean of the observations. > >> > >> Josef > >> > >> > > >> > Best, > >> > Mat > >> > > >> > > >> > 2012/2/21 > >> > > >> >> On Tue, Feb 21, 2012 at 4:38 AM, servant mathieu > >> >> wrote: > >> >> > Dear Scipy users, > >> >> > > >> >> > I've got some troubles with the scipy.optimize.curve_fit function. > >> >> > This > >> >> > function is based on the Levenburg-Maquardt algorithm, which is > >> >> > extremely > >> >> > rapid but usually finds a local minimum, not a global one (and thus > >> >> > often > >> >> > returns anormal parameter values). In my research field, we usually > >> >> > use > >> >> > Nelder Mead's simplex routines to avoid this problem. However, I > >> >> > don't > >> >> > know > >> >> > if it is possible to perform curve fitting in scipy using simplex; > >> >> > the > >> >> > fmin > >> >> > function doesn't seem to perform adjustments to data. > >> >> > > >> >> > Here is my code for fitting a three parameters hyperbolic cotangent > >> >> > function > >> >> > using curve_fit: > >> >> > > >> >> > from scipy.optimize import curve_fit > >> >> > > >> >> > import numpy as np > >> >> > > >> >> > > >> >> > > >> >> > def func (x, A,k, r ): > >> >> > > >> >> > return r + (A /(k*x)) * np.tanh (A*k*x) > >> >> > > >> >> > > >> >> > > >> >> > xdata = np.array ([0.15,0.25,0.35,0.45, 0.55, 0.75]) > >> >> > > >> >> > > >> >> > > >> >> > datacomp = np.array ([344.3276300, 324.0051063, 314.2693475, > >> >> > 309.9906375,309.9251162, 307.3955800]) > >> >> > > >> >> > dataincomp = np.array ([363.3839888, 343.5735787, 334.6013375, > >> >> > 327.7868238, > >> >> > 329.4642550, 328.0667050]) > >> >> > > >> >> > > >> >> > > >> >> > poptcomp, pcovcomp = curve_fit (func, xdata, datacomp, maxfev = > >> >> > 10000) > >> >> > > >> >> > poptincomp, pcovincomp = curve_fit (func, xdata, dataincomp, > maxfev = > >> >> > 10000) > >> >> > > >> >> > > >> >> > > >> >> > > >> >> > > >> >> > How could I proceed to perform the fitting using simplex? 
> >> >> > >> >> You need to define your own loss function for the optimizers like > >> >> fmin, or in future version minimize > >> >> > >> >> > >> >> > http://docs.scipy.org/doc/scipy-0.10.0/reference/tutorial/optimize.html#nelder-mead-simplex-algorithm-fmin > >> >> > >> >> something like this > >> >> > >> >> def loss(params, args) > >> >> A, k, r = params > >> >> y, x = args > >> >> return ((y - func (x, A,k, r ))**2).sum() > >> >> > >> >> and use loss in the call to fmin > >> >> > >> >> Josef > >> >> > >> > > >> > > >> > > >> > > >> > > >> >> > >> >> > > >> >> > > >> >> > > >> >> > Best, > >> >> > > >> >> > Mat > >> >> > > >> >> > > >> >> > _______________________________________________ > >> >> > SciPy-User mailing list > >> >> > SciPy-User at scipy.org > >> >> > http://mail.scipy.org/mailman/listinfo/scipy-user > >> >> > > >> >> _______________________________________________ > >> >> SciPy-User mailing list > >> >> SciPy-User at scipy.org > >> >> http://mail.scipy.org/mailman/listinfo/scipy-user > >> > > >> > > >> > > >> > _______________________________________________ > >> > SciPy-User mailing list > >> > SciPy-User at scipy.org > >> > http://mail.scipy.org/mailman/listinfo/scipy-user > >> > > >> _______________________________________________ > >> SciPy-User mailing list > >> SciPy-User at scipy.org > >> http://mail.scipy.org/mailman/listinfo/scipy-user > > > > > > > > _______________________________________________ > > SciPy-User mailing list > > SciPy-User at scipy.org > > http://mail.scipy.org/mailman/listinfo/scipy-user > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > -------------- next part -------------- An HTML attachment was scrubbed... URL: From pav at iki.fi Tue Feb 21 07:55:33 2012 From: pav at iki.fi (Pauli Virtanen) Date: Tue, 21 Feb 2012 13:55:33 +0100 Subject: [SciPy-User] simplex algorithm and curve fitting In-Reply-To: References: Message-ID: 21.02.2012 13:22, servant mathieu kirjoitti: > Thanks for all Joseph, I perfectly understand the problem. Please > correct me if I'm wrong, but if it finally appears that curve_fit is an > excellent optimization solver, provided that you put relevant initial > values for parameters... This applies to the majority of optimization algorithms. The situation is no different e.g. for Matlab/fminsearch (or its identical Scipy counterpart). -- Pauli Virtanen From jaakko.luttinen at aalto.fi Tue Feb 21 08:15:06 2012 From: jaakko.luttinen at aalto.fi (Jaakko Luttinen) Date: Tue, 21 Feb 2012 15:15:06 +0200 Subject: [SciPy-User] "Zero"-shape sparse matrices In-Reply-To: References: <4F3D2468.4060700@aalto.fi> Message-ID: <4F4398DA.3070208@aalto.fi> On 02/17/2012 05:32 PM, eat wrote: > Hi, > > Perhaps a slightly OT (and I'm not really answering to your question), but > > On Thu, Feb 16, 2012 at 5:44 PM, Jaakko Luttinen > > wrote: > > Hi! > > To make a long story short, Scipy doesn't seem to allow sparse matrices > that have length zero on any of the axes. For instance: > C = numpy.ones((0,0)) > K = scipy.sparse.csc_matrix(C) > ValueError: invalid shape > > It is possible to create a "zero"-shape dense matrix but not sparse. > Why? To me, this seems like a bug.. Is it so? > > what would you expect a "zero"-shape sparse (or dense) > matrix actually represent? "Zero"-shape matrix would represent an empty matrix which has a correct shape for some operations. 
This is very convenient for generic code because I don't need to check if some dimension has zero length. For instance: - horizontal concatenation of (10,3), (10,2) and (10,0) shaped matrices would work and produce a (10,5) shaped matrix. - forming a block matrix from four matrices having shapes (0,0), (0,20), (10,0) and (10,20) would produce a (10,20) matrix using numpy.bmat. Regards, Jaakko From jaakko.luttinen at aalto.fi Tue Feb 21 08:23:28 2012 From: jaakko.luttinen at aalto.fi (Jaakko Luttinen) Date: Tue, 21 Feb 2012 15:23:28 +0200 Subject: [SciPy-User] "Zero"-shape sparse matrices In-Reply-To: References: <4F3D2468.4060700@aalto.fi> Message-ID: <4F439AD0.1040808@aalto.fi> On 02/17/2012 06:48 PM, Christopher Mutel wrote: > On Fri, Feb 17, 2012 at 4:32 PM, eat wrote: >> On Thu, Feb 16, 2012 at 5:44 PM, Jaakko Luttinen >> wrote: >>> To make a long story short, Scipy doesn't seem to allow sparse matrices >>> that have length zero on any of the axes. For instance: >>> C = numpy.ones((0,0)) >>> K = scipy.sparse.csc_matrix(C) >>> ValueError: invalid shape >>> >>> It is possible to create a "zero"-shape dense matrix but not sparse. >>> Why? To me, this seems like a bug.. Is it so? > > I am not an expert, but it is my understanding that the sparse matrix > implementations in SciPy assume precisely two dimensions. One > dimension having a size of 0 would break all the assumptions of this > code. The NumPy array class is a much more generic container, and was > designed from the beginning to allow a number of slicing and > dimensionality tricks (see the documentation on numpy striding). You > can search through the mailing list from a few years ago to find a > discussion about three dimensional sparse matrices, and the conclusion > was the same: SciPy supports 2-d (in the sense of two real dimensions) > sparse matrices only. Thanks for your answer! I think a matrix with shape (10,0) would be as "2-d" as a (10,1) shaped matrix. Both have two dimensions, but neither one has both axes longer than 1. I don't mean to consider 0-d, 1-d, 3-d or N-d matrices, but empty 2-d matrices. I just don't see why there is this limitation for sparse matrices that zero is not a valid length for an axis. Regards, Jaakko From jaakko.luttinen at aalto.fi Tue Feb 21 08:42:18 2012 From: jaakko.luttinen at aalto.fi (Jaakko Luttinen) Date: Tue, 21 Feb 2012 15:42:18 +0200 Subject: [SciPy-User] einsum for sparse matrices? Message-ID: <4F439F3A.2010301@aalto.fi> Hi! The function numpy.einsum is an awesome function. Is it being implemented for sparse matrices too? Would it be easy/possible? Just curious, because I think that it could solve some of the problems related to the integration between sparse and dense matrices because several functions are just special cases of einsum. Regards, Jaakko From rishi.ou at gmail.com Mon Feb 20 15:08:38 2012 From: rishi.ou at gmail.com (RDX) Date: Mon, 20 Feb 2012 12:08:38 -0800 (PST) Subject: [SciPy-User] [SciPy-user] error fmin Message-ID: <33359168.post@talk.nabble.com> Below is I am using fmin and it returns initial guess as the final solution. def Traug (v): err=0 Traug_RHOB = v[0] + v[1]*(z_den_s/3125.0)**v[2] count =arange(size(den_s)) for i in count: err += abs(den_s[i] - Traug_RHOB [i]) return err vO=[1.6, 0.5, 0.35] t=fmin(Traug,vO ) a=t[0]; b=t[1]; c=t[2]; Code gives me a=1.6, b=0.5, c=0.35, which is my initial guess. What is wrong? 
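A vectorized, squared-error variant of this objective, which is one of the suggestions in the replies further down, would look something like the sketch below; den_s and z_den_s are only stubbed with made-up arrays here, since the real data are not shown:

import numpy as np
from scipy.optimize import fmin

# stand-ins for the poster's arrays; replace with the real data
z_den_s = np.linspace(500.0, 3000.0, 50)
den_s = 1.7 + 0.4 * (z_den_s / 3125.0) ** 0.3

def traug(v):
    model = v[0] + v[1] * (z_den_s / 3125.0) ** v[2]
    # squared error instead of the summed absolute error
    return ((den_s - model) ** 2).sum()

v0 = [1.6, 0.5, 0.35]
t = fmin(traug, v0, xtol=1e-10, ftol=1e-10, maxiter=5000, maxfun=5000)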
-- View this message in context: http://old.nabble.com/error-fmin-tp33359168p33359168.html Sent from the Scipy-User mailing list archive at Nabble.com. From rishi.ou at gmail.com Mon Feb 20 15:11:51 2012 From: rishi.ou at gmail.com (RDX) Date: Mon, 20 Feb 2012 12:11:51 -0800 (PST) Subject: [SciPy-User] [SciPy-user] error fmin Message-ID: <33359168.post@talk.nabble.com> I am using fmin and it returns initial guess as the final solution. def Traug (v): err=0 Traug_RHOB = v[0] + v[1]*(z_den_s/3125.0)**v[2] count =arange(size(den_s)) for i in count: err += abs(den_s[i] - Traug_RHOB [i]) return err vO=[1.6, 0.5, 0.35] t=fmin(Traug,vO ) a=t[0]; b=t[1]; c=t[2]; Code gives me a=1.6, b=0.5, c=0.35, which is my initial guess. What is wrong? -- View this message in context: http://old.nabble.com/error-fmin-tp33359168p33359168.html Sent from the Scipy-User mailing list archive at Nabble.com. From bjorn.nyberg.10 at aberdeen.ac.uk Mon Feb 20 17:14:29 2012 From: bjorn.nyberg.10 at aberdeen.ac.uk (Nyberg, Bjorn Johan) Date: Mon, 20 Feb 2012 22:14:29 +0000 Subject: [SciPy-User] Focal Majority Message-ID: Hi Everyone, I have an interesting problem and I was hoping I could get some ideas here. I want to apply a focal majority within a moving 3 x 3 window whereby if there is a majority (any type of majority i.e. more than 5 cells having the same value), assign the center cell a value of 1 otherwise assign a value of 0. Now I realize in scipy and numpy there are options with convolutions methods but I am not entirely certain of how to apply a condition statement that I would require into the window calculations. Thanks Nyberg -------------- next part -------------- An HTML attachment was scrubbed... URL: From sneher at gwdg.de Tue Feb 21 06:45:44 2012 From: sneher at gwdg.de (SiggiN) Date: Tue, 21 Feb 2012 03:45:44 -0800 (PST) Subject: [SciPy-User] [SciPy-user] find_objects() even slice dimentions.. Message-ID: <33363336.post@talk.nabble.com> Hallo, I'm using the find_objects() function for detecting objects in a 2d-array. Is there a way to let the find_objects() function only return slices with even dimensions in x and y. Or is there a easy way to manipulate the slices? The only thing I was able to do was simple string manipulation. I need even dimensions for performing symmetry tests. Thank you Siggi -- View this message in context: http://old.nabble.com/find_objects%28%29-even-slice-dimentions..-tp33363336p33363336.html Sent from the Scipy-User mailing list archive at Nabble.com. From sneher at gwdg.de Tue Feb 21 06:47:23 2012 From: sneher at gwdg.de (SiggiN) Date: Tue, 21 Feb 2012 03:47:23 -0800 (PST) Subject: [SciPy-User] [SciPy-user] find_objects() even slice dimensions Message-ID: <33363336.post@talk.nabble.com> Hallo, I'm using the find_objects() function for detecting objects in a 2d-array. Is there a way to let the find_objects() function only return slices with even dimensions in x and y. Or is there a easy way to manipulate the slices? The only thing I was able to do was simple string manipulation. I need even dimensions for performing symmetry tests. Thank you Siggi -- View this message in context: http://old.nabble.com/find_objects%28%29-even-slice-dimensions-tp33363336p33363336.html Sent from the Scipy-User mailing list archive at Nabble.com. 
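find_objects() returns plain tuples of Python slice objects, so one way to get even window lengths is to post-process those slices, for example by growing any odd-length slice by one cell. A rough sketch (it assumes the object can be extended by one cell without running past the array edge):

import numpy as np
from scipy import ndimage

def make_even(obj_slices, shape):
    # grow each odd-length slice by one cell
    fixed = []
    for sl, n in zip(obj_slices, shape):
        start, stop = sl.start, sl.stop
        if (stop - start) % 2 == 1:
            if stop < n:
                stop += 1
            else:
                start -= 1
        fixed.append(slice(start, stop))
    return tuple(fixed)

data = (np.random.rand(20, 20) > 0.7).astype(int)
labels, num = ndimage.label(data)
objects = ndimage.find_objects(labels)
even_objects = [make_even(obj, labels.shape) for obj in objects]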
From sneher at gwdg.de Tue Feb 21 08:11:56 2012 From: sneher at gwdg.de (SiggiN) Date: Tue, 21 Feb 2012 05:11:56 -0800 (PST) Subject: [SciPy-User] [SciPy-user] find_objects() even slice dimensions Message-ID: <33363336.post@talk.nabble.com> Hallo, I'm using the find_objects() function for detecting objects in a 2d-array. Is there a way to let the find_objects() function only return slices with even dimensions in x and y. Or is there a easy way to manipulate the slices? The only thing I was able to do was simple string manipulation. I need even dimensions for performing symmetry tests. Thank you Siggi -- View this message in context: http://old.nabble.com/find_objects%28%29-even-slice-dimensions-tp33363336p33363336.html Sent from the Scipy-User mailing list archive at Nabble.com. From sneher at gwdg.de Tue Feb 21 08:44:57 2012 From: sneher at gwdg.de (SiggiN) Date: Tue, 21 Feb 2012 05:44:57 -0800 (PST) Subject: [SciPy-User] [SciPy-user] find_objects() even slice dimensions Message-ID: <33363336.post@talk.nabble.com> Hallo, I'm using the find_objects() function for detecting objects in a 2d-array. Is there a way to let the find_objects() function only return slices with even dimensions in x and y. Or is there a easy way to manipulate the slices? The only thing I was able to do was simple string manipulation. I need even dimensions for performing symmetry tests. Thank you Siggi -- View this message in context: http://old.nabble.com/find_objects%28%29-even-slice-dimensions-tp33363336p33363336.html Sent from the Scipy-User mailing list archive at Nabble.com. From barbara.padova at gmail.com Tue Feb 21 09:12:49 2012 From: barbara.padova at gmail.com (barbara padova) Date: Tue, 21 Feb 2012 15:12:49 +0100 Subject: [SciPy-User] Delaunay Message-ID: I am noticing an unexplained behaviour when use scipy's (0.9.0) Delaunay triangulation routine. My points are UTM coordinates stored in numpy.array([[easting, northing], [easting, northing], [easting, northing]]). If I use the true coordinates Scipy's edges are missing some of my points. If I calculate the mean_easting and the mean_northing values and change my array in numpy.array([[easting-mean_easting , northing-mean_northing ], [easting-mean_easting , northing-mean_northing ], [easting-mean_easting , northing-mean_northing ]]), the result it's OK. I know that there are rounding errors and I also tried using the data type float64, but whit my original date in UTM I have the same problems. There is a solution to use my original data with scipy's Delaunay triangulation routine? Thanks for any help! Best regards, Barbara -------------- next part -------------- An HTML attachment was scrubbed... URL: From barbara.padova at gmail.com Tue Feb 21 09:26:11 2012 From: barbara.padova at gmail.com (barbara.padova) Date: Tue, 21 Feb 2012 06:26:11 -0800 (PST) Subject: [SciPy-User] Scipy's Delaunay triangulation Message-ID: I am noticing an unexplained behaviour when use scipy's (0.9.0) Delaunay triangulation routine. My points are UTM coordinates stored in numpy.array([[easting, northing], [easting, northing], [easting, northing]]). If I use the true coordinates Scipy's edges are missing some of my points. If I calculate the mean_easting and the mean_northing values and change my array in numpy.array([[easting-mean_easting , northing- mean_northing ], [easting-mean_easting , northing-mean_northing ], [easting-mean_easting , northing-mean_northing ]]), the result it's OK. 
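In code, this workaround is just a translation of the coordinates before triangulating, for example (the coordinates below are invented placeholders for real UTM values):

import numpy as np
from scipy.spatial import Delaunay

points = np.array([[500000.0, 4649776.0],
                   [500120.0, 4649990.0],
                   [500310.0, 4649800.0],
                   [500200.0, 4650150.0],
                   [500050.0, 4649900.0]])

# subtracting the centroid removes the large common offset, so less
# floating point precision is wasted on it
shifted = points - points.mean(axis=0)
tri = Delaunay(shifted)
# the resulting simplices index rows of the original `points` as well,
# since only a translation was applied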
I know that there are rounding errors and I also tried using the data type float64, but whit my original date in UTM I have the same problems. There is a solution to use my original data with scipy's Delaunay triangulation routine? Thanks for any help! Best regards, Barbara Padova From gael.varoquaux at normalesup.org Tue Feb 21 09:37:26 2012 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Tue, 21 Feb 2012 15:37:26 +0100 Subject: [SciPy-User] [SciPy-user] find_objects() even slice dimensions In-Reply-To: <33363336.post@talk.nabble.com> References: <33363336.post@talk.nabble.com> Message-ID: <20120221143726.GA29335@phare.normalesup.org> On Tue, Feb 21, 2012 at 05:44:57AM -0800, SiggiN wrote: > Or is there a easy way to manipulate the slices? The only thing I was able > to do was simple string manipulation. Slices are standard Python object. They can be instanciatated using 'slice(start, stop, step)' and start, stop, step can be accessed as attributes of these objects. http://docs.python.org/library/functions.html#slice HTH, Gael From warren.weckesser at enthought.com Tue Feb 21 10:00:44 2012 From: warren.weckesser at enthought.com (Warren Weckesser) Date: Tue, 21 Feb 2012 09:00:44 -0600 Subject: [SciPy-User] "Zero"-shape sparse matrices In-Reply-To: <4F439AD0.1040808@aalto.fi> References: <4F3D2468.4060700@aalto.fi> <4F439AD0.1040808@aalto.fi> Message-ID: On Tue, Feb 21, 2012 at 7:23 AM, Jaakko Luttinen wrote: > On 02/17/2012 06:48 PM, Christopher Mutel wrote: > > On Fri, Feb 17, 2012 at 4:32 PM, eat wrote: > >> On Thu, Feb 16, 2012 at 5:44 PM, Jaakko Luttinen < > jaakko.luttinen at aalto.fi> > >> wrote: > >>> To make a long story short, Scipy doesn't seem to allow sparse matrices > >>> that have length zero on any of the axes. For instance: > >>> C = numpy.ones((0,0)) > >>> K = scipy.sparse.csc_matrix(C) > >>> ValueError: invalid shape > >>> > >>> It is possible to create a "zero"-shape dense matrix but not sparse. > >>> Why? To me, this seems like a bug.. Is it so? > > > > I am not an expert, but it is my understanding that the sparse matrix > > implementations in SciPy assume precisely two dimensions. One > > dimension having a size of 0 would break all the assumptions of this > > code. The NumPy array class is a much more generic container, and was > > designed from the beginning to allow a number of slicing and > > dimensionality tricks (see the documentation on numpy striding). You > > can search through the mailing list from a few years ago to find a > > discussion about three dimensional sparse matrices, and the conclusion > > was the same: SciPy supports 2-d (in the sense of two real dimensions) > > sparse matrices only. > > Thanks for your answer! > > I think a matrix with shape (10,0) would be as "2-d" as a (10,1) shaped > matrix. Both have two dimensions, but neither one has both axes longer > than 1. I don't mean to consider 0-d, 1-d, 3-d or N-d matrices, but > empty 2-d matrices. I just don't see why there is this limitation for > sparse matrices that zero is not a valid length for an axis. > > In principle, sparse matrices should behave as much like dense matrices as possible. Since numpy ndarrays and matrices allow a dimension to have length 0, it is a reasonable expectation for sparse matrices to allow this also. Could you file a ticket for this? Click on the "bug reports" link here: http://projects.scipy.org/scipy Thanks, Warren -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From jaakko.luttinen at aalto.fi Tue Feb 21 10:58:37 2012 From: jaakko.luttinen at aalto.fi (Jaakko Luttinen) Date: Tue, 21 Feb 2012 17:58:37 +0200 Subject: [SciPy-User] "Zero"-shape sparse matrices In-Reply-To: References: <4F3D2468.4060700@aalto.fi> <4F439AD0.1040808@aalto.fi> Message-ID: <4F43BF2D.1070403@aalto.fi> On 02/21/2012 05:00 PM, Warren Weckesser wrote: > > > On Tue, Feb 21, 2012 at 7:23 AM, Jaakko Luttinen > > wrote: > > On 02/17/2012 06:48 PM, Christopher Mutel wrote: > > On Fri, Feb 17, 2012 at 4:32 PM, eat > wrote: > >> On Thu, Feb 16, 2012 at 5:44 PM, Jaakko Luttinen > > > >> wrote: > >>> To make a long story short, Scipy doesn't seem to allow sparse > matrices > >>> that have length zero on any of the axes. For instance: > >>> C = numpy.ones((0,0)) > >>> K = scipy.sparse.csc_matrix(C) > >>> ValueError: invalid shape > >>> > >>> It is possible to create a "zero"-shape dense matrix but not sparse. > >>> Why? To me, this seems like a bug.. Is it so? > > > > I am not an expert, but it is my understanding that the sparse matrix > > implementations in SciPy assume precisely two dimensions. One > > dimension having a size of 0 would break all the assumptions of this > > code. The NumPy array class is a much more generic container, and was > > designed from the beginning to allow a number of slicing and > > dimensionality tricks (see the documentation on numpy striding). You > > can search through the mailing list from a few years ago to find a > > discussion about three dimensional sparse matrices, and the conclusion > > was the same: SciPy supports 2-d (in the sense of two real dimensions) > > sparse matrices only. > > Thanks for your answer! > > I think a matrix with shape (10,0) would be as "2-d" as a (10,1) shaped > matrix. Both have two dimensions, but neither one has both axes longer > than 1. I don't mean to consider 0-d, 1-d, 3-d or N-d matrices, but > empty 2-d matrices. I just don't see why there is this limitation for > sparse matrices that zero is not a valid length for an axis. > > > > In principle, sparse matrices should behave as much like dense matrices > as possible. Since numpy ndarrays and matrices allow a dimension to > have length 0, it is a reasonable expectation for sparse matrices to > allow this also. Could you file a ticket for this? Click on the "bug > reports" link here: http://projects.scipy.org/scipy Ok, thanks! http://projects.scipy.org/scipy/ticket/1602 -Jaakko From Dieter.Werthmuller at ed.ac.uk Tue Feb 21 13:32:21 2012 From: Dieter.Werthmuller at ed.ac.uk (=?ISO-8859-1?Q?Dieter_Werthm=FCller?=) Date: Tue, 21 Feb 2012 18:32:21 +0000 Subject: [SciPy-User] [SciPy-user] adaptive simulated annealing (ASA) In-Reply-To: <4F391176.6080906@ed.ac.uk> References: <4F391176.6080906@ed.ac.uk> Message-ID: <4F43E335.7060809@ed.ac.uk> Dear reader, I give it another try with more explanation. If it remains unanswered I will assume that such an implementation does not exist. Adaptive Simulated Annealing (ASA) is a freely available code for simulated annealing, see http://www.ingber.com/#ASA. In 2007, Georg Holzmann was asking this list if there is an implementation of that powerful code within SciPy, see http://mail.scipy.org/pipermail/scipy-user/2007-November/014549.html . The responses back then were negative, suggesting Georg to wrap it himself and referencing to swig.org. I was wondering if the state of that issue, python wrapper for the ASA code, changed in the last 4.5 years. 
And if not, is swig.org still the best place to look for starting a wrapper myself? I am aware that there is a Matlab wrapper for ASA, with is +/- up-to-date (Feb 2011), see http://ssakata.sdf.org/software/ . I don't know if this would help to create a wrapper for Python. Many thanks, Dieter On 13/02/12 13:34, Dieter Werthm?ller wrote: > Hi there, > > The message 'adaptive simulated annealing (ASA)' is over four years old > now, asking for a python implementation of ASA > (http://www.ingber.com/#ASA). > > I was wondering if such an implementation is around today? > > Kind regards, > Dieter -- The University of Edinburgh is a charitable body, registered in Scotland, with registration number SC005336. From aronne.merrelli at gmail.com Tue Feb 21 13:58:56 2012 From: aronne.merrelli at gmail.com (Aronne Merrelli) Date: Tue, 21 Feb 2012 12:58:56 -0600 Subject: [SciPy-User] Focal Majority In-Reply-To: References: Message-ID: On Mon, Feb 20, 2012 at 4:14 PM, Nyberg, Bjorn Johan < bjorn.nyberg.10 at aberdeen.ac.uk> wrote: > Hi Everyone, > I have an interesting problem and I was hoping I could get some ideas > here. I want to apply a focal majority within a moving 3 x 3 window whereby > if there is a majority (any type of majority i.e. more than 5 cells having > the same value), assign the center cell a value of 1 otherwise assign a > value of 0. Now I realize in scipy and numpy there are options with > convolutions methods but I am not entirely certain of how to apply a > condition statement that I would require into the window calculations. > > Thanks > *Nyberg* > > I don't think a convolution would work. A convolution is really just a weighted sum, so I can't see a way to mimic a sort or conditional that way. But, I think you can do this with scipy.ndimage.rank_filter. If you want 5 cells with the same value, it should be equivalent to checking if the first and fifth ranked elements are the same (or second and sixth, etc...). So a loop through the window size, combining rank_filter calls, should do this. Definitely double check me on this - I'm not 100% sure it is doing the the correct thing, and it probably isn't doing what you want at the edges. If this is not fast enough, then I would consider writing a "brute force" loop in Cython to make it fast. In [90]: z Out[90]: array([[1, 1, 0, 8], [8, 1, 3, 1], [3, 1, 1, 2], [3, 1, 4, 5]]) In [91]: zmask = np.zeros(z.shape, bool) In [92]: for n in range(4): zmask = np.logical_or(zmask, rank_filter(z,n,3)==rank_filter(z,(n-5),3)) In [93]: zmask Out[93]: array([[ True, True, False, False], [ True, True, True, False], [False, False, True, False], [ True, False, False, False]], dtype=bool) HTH, Aronne -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Tue Feb 21 14:15:03 2012 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Tue, 21 Feb 2012 14:15:03 -0500 Subject: [SciPy-User] [SciPy-user] adaptive simulated annealing (ASA) In-Reply-To: <4F43E335.7060809@ed.ac.uk> References: <4F391176.6080906@ed.ac.uk> <4F43E335.7060809@ed.ac.uk> Message-ID: 2012/2/21 Dieter Werthm?ller : > Dear reader, > > I give it another try with more explanation. If it remains unanswered I > will assume that such an implementation does not exist. No answer and google search doesn't show anything, it looks like "such an implementation does not exist" my impression is global optimization is not very common in this neighborhood. 
Josef > > Adaptive Simulated Annealing (ASA) is a freely available code for > simulated annealing, see http://www.ingber.com/#ASA. > > In 2007, Georg Holzmann was asking this list if there is an > implementation of that powerful code within SciPy, see > http://mail.scipy.org/pipermail/scipy-user/2007-November/014549.html . > > The responses back then were negative, suggesting Georg to wrap it > himself and referencing to swig.org. > > I was wondering if the state of that issue, python wrapper for the ASA > code, changed in the last 4.5 years. And if not, is swig.org still the > best place to look for starting a wrapper myself? > > I am aware that there is a Matlab wrapper for ASA, with is +/- > up-to-date (Feb 2011), see http://ssakata.sdf.org/software/ . I don't > know if this would help to create a wrapper for Python. > > Many thanks, > Dieter > > On 13/02/12 13:34, Dieter Werthm?ller wrote: >> Hi there, >> >> The message 'adaptive simulated annealing (ASA)' is over four years old >> now, asking for a python implementation of ASA >> (http://www.ingber.com/#ASA). >> >> I was wondering if such an implementation is around today? >> >> Kind regards, >> Dieter > > -- > The University of Edinburgh is a charitable body, registered in > Scotland, with registration number SC005336. > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user From njs at pobox.com Tue Feb 21 15:00:44 2012 From: njs at pobox.com (Nathaniel Smith) Date: Tue, 21 Feb 2012 20:00:44 +0000 Subject: [SciPy-User] [SciPy-user] adaptive simulated annealing (ASA) In-Reply-To: References: <4F391176.6080906@ed.ac.uk> <4F43E335.7060809@ed.ac.uk> Message-ID: On Tue, Feb 21, 2012 at 7:15 PM, wrote: > 2012/2/21 Dieter Werthm?ller : >> Dear reader, >> >> I give it another try with more explanation. If it remains unanswered I >> will assume that such an implementation does not exist. > > No answer and google search doesn't show anything, it looks like "such > an implementation does not exist" > > my impression is global optimization is not very common in this neighborhood. Which is to say, probably no-one has gotten around to it (unless Georg Holzmann did and neglected to tell anyone), but I'm sure people would be interested if you were to write and make such code available. SWIG remains a fine way to implement Python wrappers, but I prefer Cython myself. If you go with Cython, the Demos/callback directory in the source tree has a good example of passing a Python function (like a cost function) to a C library: http://cython.org/release/Cython-0.15/Demos/callback/ Another option for is 'ctypes' (which is included with the standard Python distribution, and lets you avoid dealing with C compilers entirely). -- N From pav at iki.fi Tue Feb 21 17:37:48 2012 From: pav at iki.fi (Pauli Virtanen) Date: Tue, 21 Feb 2012 23:37:48 +0100 Subject: [SciPy-User] Scipy's Delaunay triangulation In-Reply-To: References: Message-ID: Hi, 21.02.2012 15:26, barbara.padova kirjoitti: > I am noticing an unexplained behaviour when use scipy's (0.9.0) > Delaunay triangulation routine. My points are UTM coordinates stored > in numpy.array([[easting, northing], [easting, northing], [easting, > northing]]). > If I use the true coordinates Scipy's edges are missing some of my > points. [clip] Delaunay triangulations are not well-defined for all input data sets, and are often even less well-defined numerically. 
The behavior of Scipy's triangulation routine is inherited from the computational geometry library it uses, Qhull (http://qhull.org/). In some cases, Qhull can exclude points that are ambiguous due to numerical precision from the triangulation, and I think this is what you observe. Scipy runs qhull with options "d Qz Qbb Qt" plus the defaults; you can look up the precise meanings here: http://qhull.org/html/qh-quick.htm One could argue that Scipy should raise an error in ambiguous cases, and that we should not pass "Qbb" and "Qz" by default, but rather let the user customize how they want to deal with the triangulation failure. These "numerically bad" cases are not uncommon, though, and I'm not 100% sure it is possible to tell Qhull to always include all points (see [1]). > There is a solution to use my original data with scipy's Delaunay > triangulation routine? It's a limitation of the algorithm. One option is to deduce which points are missing, and live with the fact (e.g. considering them "merged" them with the nearest points in the triangulation in your own code, or just ignoring them). Qhull also has an option "QJ" which adds numerical noise to the least significant digits of the input data, so that results become numerically better defined. Unfortunately, at the moment the Scipy interface does not allow passing this option down to Qhull. You should however be able to achieve a similar effect by adding noise to the input data also yourself, e.g. numpy.random.seed(1234) joggled = points * ( 1 + 1e-8*(numpy.random.rand(points.shape) - 0.5)) tri = Delaunay(joggled) it's not exactly the same as the "QJ" option, but should behave in the same way. -- Pauli Virtanen [1] Try feeding the following to qdelaunay: 3 10 -0.5 -0.5 -0.5 -0.5 -0.5 0.5 -0.5 0.5 -0.5 -0.5 0.5 0.5 0.5 -0.5 -0.5 0.5 -0.5 0.5 0.5 0.5 -0.5 0.5 0.5 0.5000000000000003 0.5 -0.5 0.500000000000001 0.5 0.5 0.5 From rishi.ou at gmail.com Wed Feb 22 08:41:39 2012 From: rishi.ou at gmail.com (RDX) Date: Wed, 22 Feb 2012 05:41:39 -0800 (PST) Subject: [SciPy-User] [SciPy-user] Error using fmin Message-ID: <33370870.post@talk.nabble.com> I am using fmin and it returns initial guess as the final solution. def Traug (v): err=0 Traug_RHOB = v[0] + v[1]*(z_den_s/3125.0)**v[2] count =arange(size(den_s)) for i in count: err += abs(den_s[i] - Traug_RHOB [i]) return err vO=[1.6, 0.5, 0.35] t=fmin(Traug,vO ) a=t[0]; b=t[1]; c=t[2]; Code gives me a=1.6, b=0.5, c=0.35, which is my initial guess. What is wrong? Many thanks in advance. Regards, Rishi -- View this message in context: http://old.nabble.com/Error-using-fmin-tp33370870p33370870.html Sent from the Scipy-User mailing list archive at Nabble.com. From josef.pktd at gmail.com Wed Feb 22 08:53:26 2012 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Wed, 22 Feb 2012 08:53:26 -0500 Subject: [SciPy-User] [SciPy-user] Error using fmin In-Reply-To: <33370870.post@talk.nabble.com> References: <33370870.post@talk.nabble.com> Message-ID: On Wed, Feb 22, 2012 at 8:41 AM, RDX wrote: > > I am using fmin and it returns initial guess as the final solution. > > def Traug (v): > ? ?err=0 > ? ?Traug_RHOB = v[0] + v[1]*(z_den_s/3125.0)**v[2] > ? ?count =arange(size(den_s)) > ? ?for i in count: > ? ? ? ?err += abs(den_s[i] - Traug_RHOB [i]) > ? 
?return err my first guess is that fmin doesn't like the abs, you could try with a quadratic loss instead to see whether it works and whether there are other problems If den_s[i] is an array, then you could vectorize the loop Josef > > vO=[1.6, 0.5, 0.35] > t=fmin(Traug,vO ) > a=t[0]; b=t[1]; c=t[2]; > > Code gives me a=1.6, b=0.5, c=0.35, which is my initial guess. > > What is wrong? Many thanks in advance. > > Regards, > Rishi > > > -- > View this message in context: http://old.nabble.com/Error-using-fmin-tp33370870p33370870.html > Sent from the Scipy-User mailing list archive at Nabble.com. > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user From barbara.padova at gmail.com Wed Feb 22 03:40:11 2012 From: barbara.padova at gmail.com (barbara.padova) Date: Wed, 22 Feb 2012 00:40:11 -0800 (PST) Subject: [SciPy-User] Scipy' Delaunay cource code Message-ID: <5ca7e438-00ec-4bc6-ba00-2324762bb1ca@eb6g2000vbb.googlegroups.com> I'm searching in ...\Python27\Lib\site-packages\scipy\spatial the scipy's Dealunay source code, but I'm unable to find. Does anyone know where I can find it? Thanks, Barbara Padova From hjalti at vatnaskil.is Wed Feb 22 08:49:34 2012 From: hjalti at vatnaskil.is (=?iso-8859-1?Q?Hjalti_Sigurj=F3nsson?=) Date: Wed, 22 Feb 2012 13:49:34 -0000 Subject: [SciPy-User] [SciPy-user] Error using fmin In-Reply-To: <33370870.post@talk.nabble.com> References: <33370870.post@talk.nabble.com> Message-ID: Seems that z_den_s is not defined In [20]: Traug(vO) --------------------------------------------------------------------------- NameError Traceback (most recent call last) /home/hjalti/ in () ----> 1 Traug(vO) /home/hjalti/ in Traug(v) NameError: global name 'z_den_s' is not defined -----Original Message----- From: scipy-user-bounces at scipy.org [mailto:scipy-user-bounces at scipy.org] On Behalf Of RDX Sent: 22. febr?ar 2012 13:42 To: scipy-user at scipy.org Subject: [SciPy-User] [SciPy-user] Error using fmin I am using fmin and it returns initial guess as the final solution. def Traug (v): err=0 Traug_RHOB = v[0] + v[1]*(z_den_s/3125.0)**v[2] count =arange(size(den_s)) for i in count: err += abs(den_s[i] - Traug_RHOB [i]) return err vO=[1.6, 0.5, 0.35] t=fmin(Traug,vO ) a=t[0]; b=t[1]; c=t[2]; Code gives me a=1.6, b=0.5, c=0.35, which is my initial guess. What is wrong? Many thanks in advance. Regards, Rishi -- View this message in context: http://old.nabble.com/Error-using-fmin-tp33370870p33370870.html Sent from the Scipy-User mailing list archive at Nabble.com. _______________________________________________ SciPy-User mailing list SciPy-User at scipy.org http://mail.scipy.org/mailman/listinfo/scipy-user From robert.kern at gmail.com Wed Feb 22 09:18:16 2012 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 22 Feb 2012 14:18:16 +0000 Subject: [SciPy-User] Scipy' Delaunay cource code In-Reply-To: <5ca7e438-00ec-4bc6-ba00-2324762bb1ca@eb6g2000vbb.googlegroups.com> References: <5ca7e438-00ec-4bc6-ba00-2324762bb1ca@eb6g2000vbb.googlegroups.com> Message-ID: On Wed, Feb 22, 2012 at 08:40, barbara.padova wrote: > I'm searching in ...\Python27\Lib\site-packages\scipy\spatial the > scipy's Dealunay source code, but I'm unable to find. > Does anyone know where I can find it? The installed package does not have any C sources, and the Delaunay triangulation code is in C. 
You will need to look in the scipy/spatial/ directory from the source distribution or a git checkout of the development sources. http://pypi.python.org/packages/source/s/scipy/scipy-0.10.0.zip https://github.com/scipy/scipy/tree/master/scipy/spatial -- Robert Kern From bjorn.burr.nyberg at gmail.com Wed Feb 22 11:55:55 2012 From: bjorn.burr.nyberg at gmail.com (Bjorn Nyberg) Date: Wed, 22 Feb 2012 17:55:55 +0100 Subject: [SciPy-User] Focal Majority In-Reply-To: References: Message-ID: <93C6338C-311B-49C3-BB21-97AC762A515D@gmail.com> Thanks Aronne, I only had half an hour or so to play around with it but it certainty looks promising. Im going to spend some more time on that over the weekend when im free... especially to understand how the ranking is being calculated. I have to ask though, as I am using this for GIS purposes is there an easy way to convert the Bool format into a 0,1 integer - i.e. needed to convert the array to raster format (arcpy.ArrayToRaster....). Regards, Nyberg On Feb 21, 2012, at 19:58 PM, Aronne Merrelli wrote: > > > > I don't think a convolution would work. A convolution is really just a weighted sum, so I can't see a way to mimic a sort or conditional that way. > > But, I think you can do this with scipy.ndimage.rank_filter. If you want 5 cells with the same value, it should be equivalent to checking if the first and fifth ranked elements are the same (or second and sixth, etc...). So a loop through the window size, combining rank_filter calls, should do this. Definitely double check me on this - I'm not 100% sure it is doing the the correct thing, and it probably isn't doing what you want at the edges. If this is not fast enough, then I would consider writing a "brute force" loop in Cython to make it fast. > > In [90]: z > Out[90]: > array([[1, 1, 0, 8], > [8, 1, 3, 1], > [3, 1, 1, 2], > [3, 1, 4, 5]]) > > In [91]: zmask = np.zeros(z.shape, bool) > > In [92]: for n in range(4): > zmask = np.logical_or(zmask, rank_filter(z,n,3)==rank_filter(z,(n-5),3)) > > In [93]: zmask > Out[93]: > array([[ True, True, False, False], > [ True, True, True, False], > [False, False, True, False], > [ True, False, False, False]], dtype=bool) > > > HTH, > Aronne > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user From aronne.merrelli at gmail.com Wed Feb 22 13:06:04 2012 From: aronne.merrelli at gmail.com (Aronne Merrelli) Date: Wed, 22 Feb 2012 12:06:04 -0600 Subject: [SciPy-User] Focal Majority In-Reply-To: <93C6338C-311B-49C3-BB21-97AC762A515D@gmail.com> References: <93C6338C-311B-49C3-BB21-97AC762A515D@gmail.com> Message-ID: On Wed, Feb 22, 2012 at 10:55 AM, Bjorn Nyberg wrote: > Thanks Aronne, > > I only had half an hour or so to play around with it but it certainty > looks promising. Im going to spend some more time on that over the weekend > when im free... especially to understand how the ranking is being > calculated. > It looks like it just sorts the elements within the window, and then pulls out the requested item, by the rank you specify. So, this is not terribly efficient since you will be sorting each window 10 times, while if you wrote it yourself you could sort each window once. But I'm guessing this would be faster than writing a pure NumPy/python loop, and the code is much simpler. 
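If the repeated sorting ever becomes an issue, the same majority test can also be written directly with scipy.ndimage.generic_filter, which visits each window once at the cost of a Python callback per window. A sketch, with the edge handling (the mode argument) left as a choice:

import numpy as np
from scipy import ndimage

def has_majority(window):
    # window is the flattened 3x3 neighbourhood; after sorting, at least
    # five equal values exist exactly when some element equals the one
    # four positions further along
    w = np.sort(window)
    return 1 if np.any(w[:-4] == w[4:]) else 0

z = np.array([[1, 1, 0, 8],
              [8, 1, 3, 1],
              [3, 1, 1, 2],
              [3, 1, 4, 5]])

zmask = ndimage.generic_filter(z, has_majority, size=3, mode='nearest')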
BTW I think there is a typo in what I sent before, it should be: for n in range(5): zmask = np.logical_or(zmask, rank_filter(z,n,3)==rank_filter(z,(n-5),3)) Because you need to check 5 pairs of the sorted elements - [0, 4], [1, 5], ... [4, 8], not 4 pairs as I wrote before. It might be clearer to write for n in range(5): zmask = np.logical_or(zmask, rank_filter(z,n,3)==rank_filter(z,n+4,3)) You'd probably want to derive those constants (5,n+4, etc) in terms of the window size and the number of elements that need to be equal. I'm sure if I did, the result I would be off by one, as I was before, so I will let you figure that part out =) > I have to ask though, as I am using this for GIS purposes is there an easy > way to convert the Bool format into a 0,1 integer - i.e. needed to convert > the array to raster format (arcpy.ArrayToRaster....). > > > zmask.astype() would change it to whatever you need: In [47]: array([True, False, True]) Out[47]: array([ True, False, True], dtype=bool) In [48]: array([True, False, True]).astype(int) Out[48]: array([1, 0, 1]) Obviously that will make the memory footprint a lot bigger - I guess try out whichever smallest integer works with arcpy? (I have never used package) Aronne -------------- next part -------------- An HTML attachment was scrubbed... URL: From greg.friedland at gmail.com Wed Feb 22 15:26:12 2012 From: greg.friedland at gmail.com (Greg Friedland) Date: Wed, 22 Feb 2012 12:26:12 -0800 Subject: [SciPy-User] Confidence interval for bounded minimization Message-ID: Hi, Is it possible to calculate asymptotic confidence intervals for any of the bounded minimization algorithms? As far as I can tell they don't return the Hessian; that's including the new 'minimize' function which seemed like it might. Thanks. From josef.pktd at gmail.com Wed Feb 22 15:48:37 2012 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Wed, 22 Feb 2012 15:48:37 -0500 Subject: [SciPy-User] Confidence interval for bounded minimization In-Reply-To: References: Message-ID: On Wed, Feb 22, 2012 at 3:26 PM, Greg Friedland wrote: > Hi, > > Is it possible to calculate asymptotic confidence intervals for any of > the bounded minimization algorithms? As far as I can tell they don't > return the Hessian; that's including the new 'minimize' function which > seemed like it might. If the parameter ends up at the bounds, then the standard statistics doesn't apply. The Hessian is based on a local quadratic approximation, which doesn't work if part of the local neigborhood is out of bounds. There is some special statistics for this, but so far I have seen only the description how GAUSS handles it. In statsmodels we use in some cases the bounds, or a transformation, just to keep the optimizer in the required range, and we assume we get an interior solution. In this case, it is possible to use the standard calculations, the easiest is to use the local minimum that the constraint or transformed optimizer found and use it as starting value for an unconstrained optimization where we can get the Hessian (or just calculate the Hessian based on the original objective function). Josef > > Thanks. 
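A sketch of that recipe, using a made-up normal log-likelihood as a placeholder: fit with bounds, then take a central finite-difference Hessian of the negative log-likelihood at the (interior) optimum and invert it to get the asymptotic covariance:

import numpy as np
from scipy.optimize import fmin_l_bfgs_b

def negloglike(params, y):
    mu, log_sigma = params
    sigma = np.exp(log_sigma)
    return 0.5 * np.sum(((y - mu) / sigma) ** 2) + y.size * np.log(sigma)

y = 5.0 + 2.0 * np.random.randn(200)
xopt, fval, info = fmin_l_bfgs_b(negloglike, [0.0, 0.0], args=(y,),
                                 approx_grad=True,
                                 bounds=[(None, None), (-10.0, 10.0)])

def fd_hessian(f, x, args=(), eps=1e-5):
    # central finite-difference Hessian; only meaningful at an interior optimum
    n = len(x)
    H = np.empty((n, n))
    for i in range(n):
        for j in range(n):
            xs = [np.array(x, float) for _ in range(4)]
            xs[0][i] += eps; xs[0][j] += eps
            xs[1][i] += eps; xs[1][j] -= eps
            xs[2][i] -= eps; xs[2][j] += eps
            xs[3][i] -= eps; xs[3][j] -= eps
            H[i, j] = (f(xs[0], *args) - f(xs[1], *args)
                       - f(xs[2], *args) + f(xs[3], *args)) / (4.0 * eps ** 2)
    return H

H = fd_hessian(negloglike, xopt, args=(y,))
cov = np.linalg.inv(H)
se = np.sqrt(np.diag(cov))
conf_int = np.column_stack([xopt - 1.96 * se, xopt + 1.96 * se])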
> _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user From njs at pobox.com Wed Feb 22 17:02:42 2012 From: njs at pobox.com (Nathaniel Smith) Date: Wed, 22 Feb 2012 22:02:42 +0000 Subject: [SciPy-User] Confidence interval for bounded minimization In-Reply-To: References: Message-ID: On Wed, Feb 22, 2012 at 8:48 PM, wrote: > On Wed, Feb 22, 2012 at 3:26 PM, Greg Friedland > wrote: >> Hi, >> >> Is it possible to calculate asymptotic confidence intervals for any of >> the bounded minimization algorithms? As far as I can tell they don't >> return the Hessian; that's including the new 'minimize' function which >> seemed like it might. > > If the parameter ends up at the bounds, then the standard statistics > doesn't apply. The Hessian is based on a local quadratic > approximation, which doesn't work if part of the local neigborhood is > out of bounds. > There is some special statistics for this, but so far I have seen only > the description how GAUSS handles it. > > In statsmodels we use in some cases the bounds, or a transformation, > just to keep the optimizer in the required range, and we assume we get > an interior solution. In this case, it is possible to use the standard > calculations, the easiest is to use the local minimum that the > constraint or transformed optimizer found and use it as starting value > for an unconstrained optimization where we can get the Hessian (or > just calculate the Hessian based on the original objective function). Some optimizers compute the Hessian internally. In those cases, it would be nice to have a way to ask them to somehow return that value instead of throwing it away. I haven't used Matlab in a while, but I remember running into this as a standard feature at some point, and it was quite nice. Especially when working with a problem where each computation of the Hessian requires an hour or so of computing time. -- Nathaniel From josef.pktd at gmail.com Wed Feb 22 21:06:45 2012 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Wed, 22 Feb 2012 21:06:45 -0500 Subject: [SciPy-User] Confidence interval for bounded minimization In-Reply-To: References: Message-ID: On Wed, Feb 22, 2012 at 5:02 PM, Nathaniel Smith wrote: > On Wed, Feb 22, 2012 at 8:48 PM, ? wrote: >> On Wed, Feb 22, 2012 at 3:26 PM, Greg Friedland >> wrote: >>> Hi, >>> >>> Is it possible to calculate asymptotic confidence intervals for any of >>> the bounded minimization algorithms? As far as I can tell they don't >>> return the Hessian; that's including the new 'minimize' function which >>> seemed like it might. >> >> If the parameter ends up at the bounds, then the standard statistics >> doesn't apply. The Hessian is based on a local quadratic >> approximation, which doesn't work if part of the local neigborhood is >> out of bounds. >> There is some special statistics for this, but so far I have seen only >> the description how GAUSS handles it. >> >> In statsmodels we use in some cases the bounds, or a transformation, >> just to keep the optimizer in the required range, and we assume we get >> an interior solution. In this case, it is possible to use the standard >> calculations, the easiest is to use the local minimum that the >> constraint or transformed optimizer found and use it as starting value >> for an unconstrained optimization where we can get the Hessian (or >> just calculate the Hessian based on the original objective function). 
> > Some optimizers compute the Hessian internally. In those cases, it > would be nice to have a way to ask them to somehow return that value > instead of throwing it away. I haven't used Matlab in a while, but I > remember running into this as a standard feature at some point, and it > was quite nice. Especially when working with a problem where each > computation of the Hessian requires an hour or so of computing time. If it takes an hour to compute the Hessian, then don't compute it :) My guess, without checking, is that very few optimizers calculate the full Hessian. But for those that do calculate an approximation of the Hessian, it might be useful to get them, There was the discussion once to get the Lagrange Multipliers out of some optimizers, but the person (?) who asked for it found out that the numbers are so bad that they are not usable. I think Skipper uses in statsmodels almost only analytical derivatives or our own finite difference Hessian/derivatives, where it would be possible to store the last results, but we don't do it (yet). Josef > -- Nathaniel > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user From cjordan1 at uw.edu Thu Feb 23 00:09:57 2012 From: cjordan1 at uw.edu (Christopher Jordan-Squire) Date: Wed, 22 Feb 2012 21:09:57 -0800 Subject: [SciPy-User] Confidence interval for bounded minimization In-Reply-To: References: Message-ID: On Wed, Feb 22, 2012 at 2:02 PM, Nathaniel Smith wrote: > On Wed, Feb 22, 2012 at 8:48 PM, ? wrote: >> On Wed, Feb 22, 2012 at 3:26 PM, Greg Friedland >> wrote: >>> Hi, >>> >>> Is it possible to calculate asymptotic confidence intervals for any of >>> the bounded minimization algorithms? As far as I can tell they don't >>> return the Hessian; that's including the new 'minimize' function which >>> seemed like it might. >> >> If the parameter ends up at the bounds, then the standard statistics >> doesn't apply. The Hessian is based on a local quadratic >> approximation, which doesn't work if part of the local neigborhood is >> out of bounds. >> There is some special statistics for this, but so far I have seen only >> the description how GAUSS handles it. >> >> In statsmodels we use in some cases the bounds, or a transformation, >> just to keep the optimizer in the required range, and we assume we get >> an interior solution. In this case, it is possible to use the standard >> calculations, the easiest is to use the local minimum that the >> constraint or transformed optimizer found and use it as starting value >> for an unconstrained optimization where we can get the Hessian (or >> just calculate the Hessian based on the original objective function). > > Some optimizers compute the Hessian internally. In those cases, it > would be nice to have a way to ask them to somehow return that value > instead of throwing it away. I haven't used Matlab in a while, but I > remember running into this as a standard feature at some point, and it > was quite nice. Especially when working with a problem where each > computation of the Hessian requires an hour or so of computing time. > Are you talking about analytic or finite-difference gradients and hessians? I'd assumed that anything derived from finite difference estimations wouldn't give particularly good confidence intervals, but I've never needed them so I've never looked into it in detail. 
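For first derivatives there is also the complex-step trick (the "complex derivatives" mentioned in this thread), which is essentially exact whenever the objective accepts complex input, because there is no subtractive cancellation. A small sketch on an arbitrary test function:

import numpy as np

def forward_diff(f, x, i, h=1e-6):
    e = np.zeros_like(x)
    e[i] = h
    return (f(x + e) - f(x)) / h

def complex_step(f, x, i, h=1e-20):
    xc = np.asarray(x, dtype=complex).copy()
    xc[i] += 1j * h
    return f(xc).imag / h

f = lambda p: np.exp(p[0]) * np.sin(p[1]) + p[0] ** 2
x = np.array([0.3, 1.2])
print(forward_diff(f, x, 0), complex_step(f, x, 0))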
-Chris > -- Nathaniel > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user From josef.pktd at gmail.com Thu Feb 23 01:10:02 2012 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Thu, 23 Feb 2012 01:10:02 -0500 Subject: [SciPy-User] Confidence interval for bounded minimization In-Reply-To: References: Message-ID: On Thu, Feb 23, 2012 at 12:09 AM, Christopher Jordan-Squire wrote: > On Wed, Feb 22, 2012 at 2:02 PM, Nathaniel Smith wrote: >> On Wed, Feb 22, 2012 at 8:48 PM, ? wrote: >>> On Wed, Feb 22, 2012 at 3:26 PM, Greg Friedland >>> wrote: >>>> Hi, >>>> >>>> Is it possible to calculate asymptotic confidence intervals for any of >>>> the bounded minimization algorithms? As far as I can tell they don't >>>> return the Hessian; that's including the new 'minimize' function which >>>> seemed like it might. >>> >>> If the parameter ends up at the bounds, then the standard statistics >>> doesn't apply. The Hessian is based on a local quadratic >>> approximation, which doesn't work if part of the local neigborhood is >>> out of bounds. >>> There is some special statistics for this, but so far I have seen only >>> the description how GAUSS handles it. >>> >>> In statsmodels we use in some cases the bounds, or a transformation, >>> just to keep the optimizer in the required range, and we assume we get >>> an interior solution. In this case, it is possible to use the standard >>> calculations, the easiest is to use the local minimum that the >>> constraint or transformed optimizer found and use it as starting value >>> for an unconstrained optimization where we can get the Hessian (or >>> just calculate the Hessian based on the original objective function). >> >> Some optimizers compute the Hessian internally. In those cases, it >> would be nice to have a way to ask them to somehow return that value >> instead of throwing it away. I haven't used Matlab in a while, but I >> remember running into this as a standard feature at some point, and it >> was quite nice. Especially when working with a problem where each >> computation of the Hessian requires an hour or so of computing time. >> > > Are you talking about analytic or finite-difference gradients and > hessians? I'd assumed that anything derived from finite difference > estimations wouldn't give particularly good confidence intervals, but > I've never needed them so I've never looked into it in detail. statsmodels has both, all discrete models for example have analytical gradients and hessians. But for models with a complicated log-likelihood function, there isn't much choice, second derivatives with centered finite differences are ok, scipy.optimize.leastsq is not very good. statsmodels also has complex derivatives which are numerically pretty good but they cannot always be used. I think in most cases numerical derivatives will have a precision of a few decimals, which is more precise than all the other statistical assumptions, normality, law of large numbers, local definition of covariance matrix to calculate "large" confidence intervals, and so on. One problem is that choosing the step size depends on the data and model. numdifftools has adaptive calculations for the derivatives, but we are not using it anymore. Also, if the model is not well specified, then the lower precision of finite difference derivatives can hurt. 
For example, in ARMA models I had problems when there are too many lags specified, so that some roots should almost cancel. Skipper's implementation works better because he used a reparameterization that forces some nicer behavior. The only case in the econometrics literature that I know is that early GARCH models were criticized for using numerical derivatives even though analytical derivatives were available, some parameters were not well estimated, although different estimates produced essentially the same predictions (parameters are barely identified) Last defense: everyone else does it, maybe a few models more or less, and if the same statistical method is used, then the results usually agree pretty well. (But if different methods are used, for example initial conditions are treated differently in time series analysis, then the differences are usually much larger. Something like: I don't worry about numerical problems at the 5th or 6th decimal if I cannot figure out what these guys are doing with their first and second decimal.) (maybe more than anyone wants to know.) Josef . > > -Chris > > >> -- Nathaniel >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user From josef.pktd at gmail.com Thu Feb 23 01:23:44 2012 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Thu, 23 Feb 2012 01:23:44 -0500 Subject: [SciPy-User] Confidence interval for bounded minimization In-Reply-To: References: Message-ID: On Thu, Feb 23, 2012 at 1:10 AM, wrote: > On Thu, Feb 23, 2012 at 12:09 AM, Christopher Jordan-Squire > wrote: >> On Wed, Feb 22, 2012 at 2:02 PM, Nathaniel Smith wrote: >>> On Wed, Feb 22, 2012 at 8:48 PM, ? wrote: >>>> On Wed, Feb 22, 2012 at 3:26 PM, Greg Friedland >>>> wrote: >>>>> Hi, >>>>> >>>>> Is it possible to calculate asymptotic confidence intervals for any of >>>>> the bounded minimization algorithms? As far as I can tell they don't >>>>> return the Hessian; that's including the new 'minimize' function which >>>>> seemed like it might. >>>> >>>> If the parameter ends up at the bounds, then the standard statistics >>>> doesn't apply. The Hessian is based on a local quadratic >>>> approximation, which doesn't work if part of the local neigborhood is >>>> out of bounds. >>>> There is some special statistics for this, but so far I have seen only >>>> the description how GAUSS handles it. >>>> >>>> In statsmodels we use in some cases the bounds, or a transformation, >>>> just to keep the optimizer in the required range, and we assume we get >>>> an interior solution. In this case, it is possible to use the standard >>>> calculations, the easiest is to use the local minimum that the >>>> constraint or transformed optimizer found and use it as starting value >>>> for an unconstrained optimization where we can get the Hessian (or >>>> just calculate the Hessian based on the original objective function). >>> >>> Some optimizers compute the Hessian internally. In those cases, it >>> would be nice to have a way to ask them to somehow return that value >>> instead of throwing it away. I haven't used Matlab in a while, but I >>> remember running into this as a standard feature at some point, and it >>> was quite nice. 
Especially when working with a problem where each >>> computation of the Hessian requires an hour or so of computing time. >>> >> >> Are you talking about analytic or finite-difference gradients and >> hessians? I'd assumed that anything derived from finite difference >> estimations wouldn't give particularly good confidence intervals, but >> I've never needed them so I've never looked into it in detail. > > statsmodels has both, all discrete models for example have analytical > gradients and hessians. > > But for models with a complicated log-likelihood function, there isn't > much choice, second derivatives with centered finite differences are > ok, scipy.optimize.leastsq is not very good. statsmodels also has > complex derivatives which are numerically pretty good but they cannot > always be used. > > I think in most cases numerical derivatives will have a precision of a > few decimals, which is more precise than all the other statistical > assumptions, normality, law of large numbers, local definition of > covariance matrix to calculate "large" confidence intervals, and so > on. > > One problem is that choosing the step size depends on the data and > model. numdifftools has adaptive calculations for the derivatives, but > we are not using it anymore. > > Also, if the model is not well specified, then the lower precision of > finite difference derivatives can hurt. For example, in ARMA models I > had problems when there are too many lags specified, so that some > roots should almost cancel. Skipper's implementation works better > because he used a reparameterization that forces some nicer behavior. > > The only case in the econometrics literature that I know is that early > GARCH models were criticized for using numerical derivatives even > though analytical derivatives were available, some parameters were not > well estimated, although different estimates produced essentially the > same predictions (parameters are barely identified) > > Last defense: everyone else does it, maybe a few models more or less, > and if the same statistical method is used, then the results usually > agree pretty well. > (But if different methods are used, for example initial conditions are > treated differently in time series analysis, then the differences are > usually much larger. Something like: I don't worry about numerical > problems at the 5th or 6th decimal if I cannot figure out what these > guys are doing with their first and second decimal.) > > (maybe more than anyone wants to know.) In case it wasn't clear: analytical derivatives are of course much better, and I would be glad if the scipy.stats.distributions or sympy had the formulas for the derivatives of the log-likelihood functions for the main distributions. (but it's work) Josef > > Josef > . > >> >> -Chris >> >> >>> -- Nathaniel >>> _______________________________________________ >>> SciPy-User mailing list >>> SciPy-User at scipy.org >>> http://mail.scipy.org/mailman/listinfo/scipy-user >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user From smokefloat at gmail.com Thu Feb 23 06:04:21 2012 From: smokefloat at gmail.com (David Hutto) Date: Thu, 23 Feb 2012 06:04:21 -0500 Subject: [SciPy-User] parsing a wave file Message-ID: Hi, I'm using scypy 0.10.1RC2 amd 64 with python 2.7. 
I'm attempting to parse a wav file to access the data chunks to show the values for use in an oscilloscope(to know the intended usage, and maybe a better way to go about the solution). The following code: . #################### f = Sndfile(r'c:\Users\david\test01.wav', 'r') fs = f.samplerate nc = f.channels enc = f.encoding data = f.read_frames(1000) frame_amount = 1000 data_float = f.read_frames(frame_amount, dtype=np.float32) for i in range(0,frame_amount,1): print data_float[i] ############## returns data_float[i] in the form: [-1,0.990988] [.08545,-0.009988] etc. My question is, is this the portion of data I'm parsing for(I'm almost positive it's not), or is there another data chunk? The graphing of the given data displays nothing close to the supplied current being recorded in the wav file. Any suggestions as to what I'm parsing in the wrong way, or better solutions than the above? Thanks, David From warren.weckesser at enthought.com Thu Feb 23 08:26:14 2012 From: warren.weckesser at enthought.com (Warren Weckesser) Date: Thu, 23 Feb 2012 07:26:14 -0600 Subject: [SciPy-User] parsing a wave file In-Reply-To: References: Message-ID: On Thu, Feb 23, 2012 at 5:04 AM, David Hutto wrote: > Hi, > I'm using scypy 0.10.1RC2 amd 64 with python 2.7. > I'm attempting to parse a wav file to access the data chunks to show > the values for use > in an oscilloscope(to know the intended usage, and maybe a better way > to go about the solution). > > The following code: > . > > #################### > f = Sndfile(r'c:\Users\david\test01.wav', 'r') > fs = f.samplerate > nc = f.channels > enc = f.encoding > data = f.read_frames(1000) > frame_amount = 1000 > data_float = f.read_frames(frame_amount, dtype=np.float32) > for i in range(0,frame_amount,1): > print data_float[i] > ############## > returns data_float[i] in the form: > [-1,0.990988] > [.08545,-0.009988] > etc. > > My question is, is this the portion of data I'm parsing for(I'm almost > positive it's not), or is there another data chunk? The graphing of > the given data displays nothing close to the supplied current being > recorded in the wav file. > > Any suggestions as to what I'm parsing in the wrong way, or better > solutions than the above? > > > Can you use the wavfile module in scipy.io? E.g. >>> from scipy.io import wavfile >>> fs, data = wavfile.read(r'c:\Users\david\test01.wav') Warren -------------- next part -------------- An HTML attachment was scrubbed... URL: From friedrichromstedt at gmail.com Thu Feb 23 09:08:04 2012 From: friedrichromstedt at gmail.com (Friedrich Romstedt) Date: Thu, 23 Feb 2012 15:08:04 +0100 Subject: [SciPy-User] parsing a wave file In-Reply-To: References: Message-ID: Am 23.02.2012 um 12:04 schrieb David Hutto : > #################### > f = Sndfile(r'c:\Users\david\test01.wav', 'r') Sorry for the probably dumb question, but where does "Sndfile" originate from? A quick search in the Python 2.6.5 docs and in the online scipy docs yields nothing. > fs = f.samplerate > nc = f.channels > enc = f.encoding > data = f.read_frames(1000) > frame_amount = 1000 > data_float = f.read_frames(frame_amount, dtype=np.float32) > for i in range(0,frame_amount,1): > print data_float[i] > ############## > returns data_float[i] in the form: > [-1,0.990988] > [.08545,-0.009988] It looks to me as if the "Sndfile" instance already decomposes into channels. 
In the wave file, the channels are intermingled, one frame per channel for each time instance (I'm not fully sure if "frame" denotes the full chunk for one time instance or only one datum for one channel, part of the chunk). > Any suggestions as to what I'm parsing in the wrong way, or better > solutions than the above? I could imagine that the Sndfile class you're using takes the sampling width (i.e., the number of bytes per sample) from the dtype you're handing over, interpreting the originally possible integer-valued frames as float32's. This could also alter the alignment, so that full garbage would result, possibly explaining the apparently highly variable output (as far as I can judge from the two chunks you provided). But notice, this is only guessing, speculation so to speak. I've made positive experience with using the wave module from the standard library. I wrote a module (part of a larger sound analysis suite) to read arbitrary wave files as long as they are integer-valued. The module reads the file into a numpy array with sufficient speed (a few seconds for a 3' piece), and separates the channels. I guess it could be easily adapted to your use case (currently the channels are averaged in the end; you would just have to leave that step out). I would give it to you as a public domain module. For the scipy.io.wavfile module, that should work too, although I don't know if it already separates the channels or not?I never used it so far. Friedrich From kevin.gullikson at gmail.com Wed Feb 22 12:07:21 2012 From: kevin.gullikson at gmail.com (Kevin Gullikson) Date: Wed, 22 Feb 2012 11:07:21 -0600 Subject: [SciPy-User] Focal Majority In-Reply-To: <93C6338C-311B-49C3-BB21-97AC762A515D@gmail.com> References: <93C6338C-311B-49C3-BB21-97AC762A515D@gmail.com> Message-ID: I'm not positive it is the best way, but you can just cast your array as an integer: >>> int(True) 1 >>> int(False) 0 Kevin Gullikson On Wed, Feb 22, 2012 at 10:55 AM, Bjorn Nyberg wrote: > Thanks Aronne, > > I only had half an hour or so to play around with it but it certainty > looks promising. Im going to spend some more time on that over the weekend > when im free... especially to understand how the ranking is being > calculated. > > I have to ask though, as I am using this for GIS purposes is there an easy > way to convert the Bool format into a 0,1 integer - i.e. needed to convert > the array to raster format (arcpy.ArrayToRaster....). > > Regards, > Nyberg > > On Feb 21, 2012, at 19:58 PM, Aronne Merrelli wrote: > > > > > > > > > I don't think a convolution would work. A convolution is really just a > weighted sum, so I can't see a way to mimic a sort or conditional that way. > > > > But, I think you can do this with scipy.ndimage.rank_filter. If you want > 5 cells with the same value, it should be equivalent to checking if the > first and fifth ranked elements are the same (or second and sixth, etc...). > So a loop through the window size, combining rank_filter calls, should do > this. Definitely double check me on this - I'm not 100% sure it is doing > the the correct thing, and it probably isn't doing what you want at the > edges. If this is not fast enough, then I would consider writing a "brute > force" loop in Cython to make it fast. 
> > > > In [90]: z > > Out[90]: > > array([[1, 1, 0, 8], > > [8, 1, 3, 1], > > [3, 1, 1, 2], > > [3, 1, 4, 5]]) > > > > In [91]: zmask = np.zeros(z.shape, bool) > > > > In [92]: for n in range(4): > > zmask = np.logical_or(zmask, > rank_filter(z,n,3)==rank_filter(z,(n-5),3)) > > > > In [93]: zmask > > Out[93]: > > array([[ True, True, False, False], > > [ True, True, True, False], > > [False, False, True, False], > > [ True, False, False, False]], dtype=bool) > > > > > > HTH, > > Aronne > > _______________________________________________ > > SciPy-User mailing list > > SciPy-User at scipy.org > > http://mail.scipy.org/mailman/listinfo/scipy-user > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jaakko.luttinen at aalto.fi Thu Feb 23 09:49:01 2012 From: jaakko.luttinen at aalto.fi (Jaakko Luttinen) Date: Thu, 23 Feb 2012 16:49:01 +0200 Subject: [SciPy-User] Mixing arrays, matrices and sparse matrices Message-ID: <4F4651DD.6040402@aalto.fi> Hi! I am trying to work without the matrix class in order to avoid problems, but it is difficult because mixing arrays and sparse matrices results in matrices. I believe that NumPy/SciPy works incorrectly when using arrays, matrices and sparse matrices mixed. I thought that results would be arrays rather than matrices because arrays are more general and the result may require more than two dimensions. For instance, I think all the following operations should return arrays: array.dot(array) = array matrix.dot(array) = matrix sparse.dot(array) = array array + matrix = matrix array + sparse = matrix On the other hand, sparse.multiply(array) should return sparse (because the element-wise product makes the result sparse), but there are no N-dimensional sparse arrays.. At the moment, sparse.multiply(array) returns a dense matrix, which is not good in my opinion - either return an array or a sparse matrix. I think it would be easiest if dense results were always given as arrays when there are arrays involved. Can anyone help? Should I create a ticket? Regards, Jaakko From lamblinp at iro.umontreal.ca Thu Feb 23 17:41:13 2012 From: lamblinp at iro.umontreal.ca (Pascal Lamblin) Date: Thu, 23 Feb 2012 23:41:13 +0100 Subject: [SciPy-User] Announcing Theano 0.5 Message-ID: <20120223224113.GA26872@bob.blip.be> =========================== Announcing Theano 0.5 =========================== This is a major version, with lots of new features, bug fixes, and some interface changes (deprecated or potentially misleading features were removed). Upgrading to Theano 0.5 is recommended for everyone, but you should first make sure that your code does not raise deprecation warnings with Theano 0.4.1. Otherwise, in one case the results can change. In other cases, the warnings are turned into errors (see below for details). For those using the bleeding edge version in the git repository, we encourage you to update to the `0.5` tag. If you have updated to 0.5rc1 or 0.5rc2, you are highly encouraged to update to 0.5, as some bugs introduced in those versions have now been fixed, see items marked with '#' in the lists below. 
What's New ---------- Highlight: * Moved to github: http://github.com/Theano/Theano/ * Old trac ticket moved to assembla ticket: http://www.assembla.com/spaces/theano/tickets * Theano vision: http://deeplearning.net/software/theano/introduction.html#theano-vision (Many people) * Theano with GPU works in some cases on Windows now. Still experimental. (Sebastian Urban) * Faster dot() call: New/Better direct call to cpu and gpu ger, gemv, gemm and dot(vector, vector). (James, Fr?d?ric, Pascal) * C implementation of Alloc. (James, Pascal) * theano.grad() now also work with sparse variable. (Arnaud) * Macro to implement the Jacobian/Hessian with theano.tensor.{jacobian,hessian} (Razvan) * See the Interface changes. Interface Behavior Changes: * The current default value of the parameter axis of theano.{max,min,argmax,argmin,max_and_argmax} is now the same as numpy: None. i.e. operate on all dimensions of the tensor. (Fr?d?ric Bastien, Olivier Delalleau) (was deprecated and generated a warning since Theano 0.3 released Nov. 23rd, 2010) * The current output dtype of sum with input dtype [u]int* is now always [u]int64. You can specify the output dtype with a new dtype parameter to sum. The output dtype is the one using for the summation. There is no warning in previous Theano version about this. The consequence is that the sum is done in a dtype with more precision than before. So the sum could be slower, but will be more resistent to overflow. This new behavior is the same as numpy. (Olivier, Pascal) # When using a GPU, detect faulty nvidia drivers. This was detected when running Theano tests. Now this is always tested. Faulty drivers results in in wrong results for reduce operations. (Frederic B.) Interface Features Removed (most were deprecated): * The string modes FAST_RUN_NOGC and STABILIZE are not accepted. They were accepted only by theano.function(). Use Mode(linker='c|py_nogc') or Mode(optimizer='stabilize') instead. * tensor.grad(cost, wrt) now always returns an object of the "same type" as wrt (list/tuple/TensorVariable). (Ian Goodfellow, Olivier) * A few tag.shape and Join.vec_length left have been removed. (Frederic) * The .value attribute of shared variables is removed, use shared.set_value() or shared.get_value() instead. (Frederic) * Theano config option "home" is not used anymore as it was redundant with "base_compiledir". If you use it, Theano will now raise an error. (Olivier D.) * scan interface changes: (Razvan Pascanu) * The use of `return_steps` for specifying how many entries of the output to return has been removed. Instead, apply a subtensor to the output returned by scan to select a certain slice. * The inner function (that scan receives) should return its outputs and updates following this order: [outputs], [updates], [condition]. One can skip any of the three if not used, but the order has to stay unchanged. Interface bug fix: * Rop in some case should have returned a list of one Theano variable, but returned the variable itself. (Razvan) New deprecation (will be removed in Theano 0.6, warning generated if you use them): * tensor.shared() renamed to tensor._shared(). You probably want to call theano.shared() instead! (Olivier D.) Bug fixes (incorrect results): * On CPU, if the convolution had received explicit shape information, they where not checked at runtime. This caused wrong result if the input shape was not the one expected. (Frederic, reported by Sander Dieleman) * Theoretical bug: in some case we could have GPUSum return bad value. 
We were not able to reproduce this problem * patterns affected ({0,1}*nb dim, 0 no reduction on this dim, 1 reduction on this dim): 01, 011, 0111, 010, 10, 001, 0011, 0101 (Frederic) * div by zero in verify_grad. This hid a bug in the grad of Images2Neibs. (James) * theano.sandbox.neighbors.Images2Neibs grad was returning a wrong value. The grad is now disabled and returns an error. (Frederic) * An expression of the form "1 / (exp(x) +- constant)" was systematically matched to "1 / (exp(x) + 1)" and turned into a sigmoid regardless of the value of the constant. A warning will be issued if your code was affected by this bug. (Olivier, reported by Sander Dieleman) * When indexing into a subtensor of negative stride (for instance, x[a:b:-1][c]), an optimization replacing it with a direct indexing (x[d]) used an incorrect formula, leading to incorrect results. (Pascal, reported by Razvan) * The tile() function is now stricter in what it accepts to allow for better error-checking/avoiding nonsensical situations. The gradient has been disabled for the time being as it only implemented (incorrectly) one special case. The `reps` argument must be a constant (not a tensor variable), and must have the same length as the number of dimensions in the `x` argument; this is now checked. (David) # Fix a bug with Gemv and Ger on CPU, when used on vectors with negative strides. Data was read from incorrect (and possibly uninitialized) memory space. This bug was probably introduced in 0.5rc1. (Pascal L.) # The Theano flag "nvcc.flags" is now included in the hard part of the key. This mean that now we recompile all modules for each value of "nvcc.flags". A change in "nvcc.flags" used to be ignored for module that were already compiled. (Frederic B.) Scan fixes: * computing grad of a function of grad of scan (reported by Justin Bayer, fix by Razvan) before : most of the time crash, but could be wrong value with bad number of dimensions (so a visible bug) now : do the right thing. * gradient with respect to outputs using multiple taps (reported by Timothy, fix by Razvan) before : it used to return wrong values now : do the right thing. Note: The reported case of this bug was happening in conjunction with the save optimization of scan that give run time errors. So if you didn't manually disable the same memory optimization (number in the list4), you are fine if you didn't manually request multiple taps. * Rop of gradient of scan (reported by Timothy and Justin Bayer, fix by Razvan) before : compilation error when computing R-op now : do the right thing. * save memory optimization of scan (reported by Timothy and Nicolas BL, fix by Razvan) before : for certain corner cases used to result in a runtime shape error now : do the right thing. * Scan grad when the input of scan has sequences of different lengths. (Razvan, reported by Michael Forbes) * Scan.infer_shape now works correctly when working with a condition for the number of loops. In the past, it returned n_steps as the length, which is not always true. (Razvan) * Scan.infer_shape crash fix. 
(Razvan) New features: * AdvancedIncSubtensor grad defined and tested (Justin Bayer) * Adding 1D advanced indexing support to inc_subtensor and set_subtensor (James Bergstra) * tensor.{zeros,ones}_like now support the dtype param as numpy (Frederic) * Added configuration flag "exception_verbosity" to control the verbosity of exceptions (Ian) * theano-cache list: list the content of the theano cache (Frederic) * theano-cache unlock: remove the Theano lock (Olivier) * tensor.ceil_int_div to compute ceil(a / float(b)) (Frederic) * MaxAndArgMax.grad now works with any axis (The op supports only 1 axis) (Frederic) * used by tensor.{max,min,max_and_argmax} * tensor.{all,any} (Razvan) * tensor.roll as numpy: (Matthew Rocklin, David Warde-Farley) * Theano with GPU works in some cases on Windows now. Still experimental. (Sebastian Urban) * IfElse now allows to have a list/tuple as the result of the if/else branches. * They must have the same length and corresponding type (Razvan) * Argmax output dtype is now int64 instead of int32. (Olivier) * Added the element-wise operation arccos. (Ian) * Added sparse dot with dense grad output. (Yann Dauphin) * Optimized to Usmm and UsmmCscDense in some case (Yann) * Note: theano.dot and theano.sparse.structured_dot() always had a gradient with the same sparsity pattern as the inputs. The new theano.sparse.dot() has a dense gradient for all inputs. * GpuAdvancedSubtensor1 supports broadcasted dimensions. (Frederic) * TensorVariable.zeros_like() and SparseVariable.zeros_like() * theano.sandbox.cuda.cuda_ndarray.cuda_ndarray.device_properties() (Frederic) * theano.sandbox.cuda.cuda_ndarray.cuda_ndarray.mem_info() return free and total gpu memory (Frederic) * Theano flags compiledir_format. Keep the same default as before: compiledir_%(platform)s-%(processor)s-%(python_version)s. (Josh Bleecher Snyder) * We also support the "theano_version" substitution. * IntDiv c code (faster and allow this elemwise to be fused with other elemwise) (Pascal) * Internal filter_variable mechanism in Type. (Pascal, Ian) * Ifelse works on sparse. * It makes use of gpu shared variable more transparent with theano.function updates and givens parameter. * Added a_tensor.transpose(axes) axes is optional (James) * theano.tensor.transpose(a_tensor, kwargs) We where ignoring kwargs, now it is used as the axes. * a_CudaNdarray_object[*] = int, now works (Frederic) * tensor_variable.size (as numpy) computes the product of the shape elements. (Olivier) * sparse_variable.size (as scipy) computes the number of stored values. (Olivier) * sparse_variable[N, N] now works (Li Yao, Frederic) * sparse_variable[M:N, O:P] now works (Li Yao, Frederic, Pascal) M, N, O, and P can be Python int or scalar tensor variables, None, or omitted (sparse_variable[:, :M] or sparse_variable[:M, N:] work). * tensor.tensordot can now be moved to GPU (Sander Dieleman, Pascal, based on code from Tijmen Tieleman's gnumpy, http://www.cs.toronto.edu/~tijmen/gnumpy.html) # Many infer_shape implemented on sparse matrices op. (David W.F.) # Added theano.sparse.verify_grad_sparse to easily allow testing grad of sparse op. It support testing the full and structured gradient. # The keys in our cache now store the hash of constants and not the constant values themselves. This is significantly more efficient for big constant arrays. (Frederic B.) # 'theano-cache list' lists key files bigger than 1M (Frederic B.) # 'theano-cache list' prints an histogram of the number of keys per compiled module (Frederic B.) 
# 'theano-cache list' prints the number of compiled modules per op class (Frederic B.) # The Theano flag "nvcc.fastmath" is now also used for the cuda_ndarray.cu file. # Add the header_dirs to the hard part of the compilation key. This is currently used only by cuda, but if we use library that are only headers, this can be useful. (Frederic B.) # Alloc, GpuAlloc are not always pre-computed (constant_folding optimization) at compile time if all their inputs are constant. (Frederic B., Pascal L., reported by Sander Dieleman) # New Op tensor.sort(), wrapping numpy.sort (Hani Almousli) New optimizations: * AdvancedSubtensor1 reuses preallocated memory if available (scan, c|py_nogc linker) (Frederic) * dot22, dot22scalar work with complex. (Frederic) * Generate Gemv/Gemm more often. (James) * Remove scan when all computations can be moved outside the loop. (Razvan) * scan optimization done earlier. This allows other optimizations to be applied. (Frederic, Guillaume, Razvan) * exp(x) * sigmoid(-x) is now correctly optimized to the more stable form sigmoid(x). (Olivier) * Added Subtensor(Rebroadcast(x)) => Rebroadcast(Subtensor(x)) optimization. (Guillaume) * Made the optimization process faster. (James) * Allow fusion of elemwise when the scalar op needs support code. (James) * Better opt that lifts transpose around dot. (James) Crashes fixed: * T.mean crash at graph building time. (Ian) * "Interactive debugger" crash fix. (Ian, Frederic) * Do not call gemm with strides 0, some blas refuse it. (Pascal Lamblin) * Optimization crash with gemm and complex. (Frederic) * GPU crash with elemwise. (Frederic, some reported by Chris Currivan) * Compilation crash with amdlibm and the GPU. (Frederic) * IfElse crash. (Frederic) * Execution crash fix in AdvancedSubtensor1 on 32 bit computers. (Pascal) * GPU compilation crash on MacOS X. (Olivier) * Support for OSX Enthought Python Distribution 7.x. (Graham Taylor, Olivier) * When the subtensor inputs had 0 dimensions and the outputs 0 dimensions. (Frederic) * Crash when the step to subtensor was not 1 in conjunction with some optimization. (Frederic, reported by Olivier Chapelle) * Runtime crash related to an optimization with subtensor of alloc (reported by Razvan, fixed by Frederic) * Fix dot22scalar cast of integer scalars (Justin Bayer, Fr?d?ric, Olivier) * Fix runtime crash in gemm, dot22. FB * Fix on 32bits computer: make sure all shape are int64.(Olivier) * Fix to deque on python 2.4 (Olivier) * Fix crash when not using c code (or using DebugMode) (not used by default) with numpy 1.6*. Numpy has a bug in the reduction code that made it crash. (Pascal) * Crashes of blas functions (Gemv on CPU; Ger, Gemv and Gemm on GPU) when matrices had non-unit stride in both dimensions (CPU and GPU), or when matrices had negative strides (GPU only). In those cases, we are now making copies. (Pascal) # More cases supported in AdvancedIncSubtensor1. (Olivier D.) # Fix crash when a broadcasted constant was used as input of an elemwise Op and needed to be upcasted to match the op's output. (Reported by John Salvatier, fixed by Pascal L.) # Fixed a memory leak with shared variable (we kept a pointer to the original value) (Ian G.) Known bugs: * CAReduce with nan in inputs don't return the good output (`Ticket `_). * This is used in tensor.{max,mean,prod,sum} and in the grad of PermuteRowElements. Sandbox: * cvm interface more consistent with current linker. (James) * Now all tests pass with the linker=cvm flags. * vm linker has a callback parameter. 
(James) * review/finish/doc: diag/extract_diag. (Arnaud Bergeron, Frederic, Olivier) * review/finish/doc: AllocDiag/diag. (Arnaud, Frederic, Guillaume) * review/finish/doc: MatrixInverse, matrix_inverse. (Razvan) * review/finish/doc: matrix_dot. (Razvan) * review/finish/doc: det (determinent) op. (Philippe Hamel) * review/finish/doc: Cholesky determinent op. (David) * review/finish/doc: ensure_sorted_indices. (Li Yao) * review/finish/doc: spectral_radius_boud. (Xavier Glorot) * review/finish/doc: sparse sum. (Valentin Bisson) * review/finish/doc: Remove0 (Valentin) * review/finish/doc: SquareDiagonal (Eric) Sandbox New features (not enabled by default): * CURAND_RandomStreams for uniform and normal (not picklable, GPU only) (James) * New sandbox.linalg.ops.pinv(pseudo-inverse) op (Razvan) Documentation: * Many updates. (Many people) * Updates to install doc on MacOS. (Olivier) * Updates to install doc on Windows. (David, Olivier) * Doc on the Rop function (Ian) * Added how to use scan to loop with a condition as the number of iteration. (Razvan) * Added how to wrap in Theano an existing python function (in numpy, scipy, ...). (Frederic) * Refactored GPU installation of Theano. (Olivier) Others: * Better error messages in many places. (Many people) * PEP8 fixes. (Many people) * Add a warning about numpy bug when using advanced indexing on a tensor with more than 2**32 elements (the resulting array is not correctly filled and ends with zeros). (Pascal, reported by David WF) * Added Scalar.ndim=0 and ScalarSharedVariable.ndim=0 (simplify code) (Razvan) * New min_informative_str() function to print graph. (Ian) * Fix catching of exception. (Sometimes we used to catch interrupts) (Frederic, David, Ian, Olivier) * Better support for utf string. (David) * Fix pydotprint with a function compiled with a ProfileMode (Frederic) * Was broken with change to the profiler. * Warning when people have old cache entries. (Olivier) * More tests for join on the GPU and CPU. (Frederic) * Do not request to load the GPU module by default in scan module. (Razvan) * Fixed some import problems. (Frederic and others) * Filtering update. (James) * On Windows, the default compiledir changed to be local to the computer/user and not transferred with roaming profile. (Sebastian Urban) * New theano flag "on_shape_error". Defaults to "warn" (same as previous behavior): it prints a warning when an error occurs when inferring the shape of some apply node. The other accepted value is "raise" to raise an error when this happens. (Frederic) * The buidbot now raises optimization/shape errors instead of just printing a warning. (Frederic) * better pycuda tests (Frederic) * check_blas.py now accept the shape and the number of iteration as parameter (Frederic) * Fix opt warning when the opt ShapeOpt is disabled (enabled by default) (Frederic) * More internal verification on what each op.infer_shape return. (Frederic, James) * Argmax dtype to int64 (Olivier) * Improved docstring and basic tests for the Tile Op (David). Reviewers (alphabetical order): * David, Frederic, Ian, James, Olivier, Razvan Download and Install -------------------- You can download Theano from http://pypi.python.org/pypi/Theano Installation instructions are available at http://deeplearning.net/software/theano/install.html Description ----------- Theano is a Python library that allows you to define, optimize, and efficiently evaluate mathematical expressions involving multi-dimensional arrays. It is built on top of NumPy. 
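A minimal sketch of the basic workflow (define a symbolic expression, compile it, evaluate it and its gradient), assuming a plain CPU installation; the expression here is just an arbitrary example:

import theano
import theano.tensor as T

x = T.dvector('x')                      # symbolic input vector
y = T.sum(1.0 / (1.0 + T.exp(-x)))      # symbolic expression: sum of sigmoids
gy = T.grad(y, x)                       # symbolic gradient of y w.r.t. x

f = theano.function([x], [y, gy])       # compile to optimized C (or GPU) code

value, grad = f([0.0, 1.0, -2.0])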
Theano features: * tight integration with NumPy: a similar interface to NumPy's. numpy.ndarrays are also used internally in Theano-compiled functions. * transparent use of a GPU: perform data-intensive computations up to 140x faster than on a CPU (support for float32 only). * efficient symbolic differentiation: Theano can compute derivatives for functions of one or many inputs. * speed and stability optimizations: avoid nasty bugs when computing expressions such as log(1+ exp(x)) for large values of x. * dynamic C code generation: evaluate expressions faster. * extensive unit-testing and self-verification: includes tools for detecting and diagnosing bugs and/or potential problems. Theano has been powering large-scale computationally intensive scientific research since 2007, but it is also approachable enough to be used in the classroom (IFT6266 at the University of Montreal). Resources --------- About Theano: http://deeplearning.net/software/theano/ Theano-related projects: http://github.com/Theano/Theano/wiki/Related-projects About NumPy: http://numpy.scipy.org/ About SciPy: http://www.scipy.org/ Machine Learning Tutorial with Theano on Deep Architectures: http://deeplearning.net/tutorial/ Acknowledgments --------------- I would like to thank all contributors of Theano. For this particular release, many people have helped, notably (in alphabetical order): Hani Almousli, Fr?d?ric Bastien, Justin Bayer, Arnaud Bergeron, James Bergstra, Valentin Bisson, Josh Bleecher Snyder, Yann Dauphin, Olivier Delalleau, Guillaume Desjardins, Sander Dieleman, Xavier Glorot, Ian Goodfellow, Philippe Hamel, Pascal Lamblin, Eric Laufer, Gr?goire Mesnil, Razvan Pascanu, Matthew Rocklin, Graham Taylor, Sebastian Urban, David Warde-Farley, and Yao Li. I would also like to thank users who submitted bug reports, notably: Nicolas Boulanger-Lewandowski, Olivier Chapelle, Michael Forbes, Timothy Lillicrap, and John Salvatier. Also, thank you to all NumPy and Scipy developers as Theano builds on their strengths. -- Pascal From smokefloat at gmail.com Fri Feb 24 06:26:02 2012 From: smokefloat at gmail.com (David Hutto) Date: Fri, 24 Feb 2012 06:26:02 -0500 Subject: [SciPy-User] parsing a wave file In-Reply-To: References: Message-ID: On Thu, Feb 23, 2012 at 9:08 AM, Friedrich Romstedt wrote: > Am 23.02.2012 um 12:04 schrieb David Hutto : >> #################### >> ? ? ? ? ? ?f = Sndfile(r'c:\Users\david\test01.wav', 'r') > > Sorry for the probably dumb question, but where does "Sndfile" originate from? An online example. ?A quick search in the Python 2.6.5 docs I'm using 2.7.2 and in the online scipy docs yields nothing. It was a .dll to access functions from I believe. > >> ? ? ? ?fs = f.samplerate >> ? ? ? ?nc = f.channels >> ? ? ? ?enc = f.encoding >> ? ? ? ?data = f.read_frames(1000) >> ? ? ? ?frame_amount = 1000 >> ? ? ? ?data_float = f.read_frames(frame_amount, dtype=np.float32) >> ? ? ? ?for i in range(0,frame_amount,1): >> ? ? ? ? ? ?print data_float[i] >> ? ?############## >> returns data_float[i] in the form: >> [-1,0.990988] >> [.08545,-0.009988] > > It looks to me as if the "Sndfile" instance already decomposes into channels. ?In the wave file, the channels are intermingled, one frame per channel for each time instance (I'm not fully sure if "frame" denotes the full chunk for one time instance or only one datum for one channel, part of the chunk). > >> Any suggestions as to what I'm parsing in the wrong way, or better >> solutions than the above? 
> > I could imagine that the Sndfile class you're using takes the sampling width (i.e., the number of bytes per sample) from the dtype you're handing over, interpreting the originally possible integer-valued frames as float32's. This could also alter the alignment, so that full garbage would result, possibly explaining the apparently highly variable output (as far as I can judge from the two chunks you provided). ?But notice, this is only guessing, speculation so to speak. > > I've made positive experience with using the wave module from the standard library. I wrote a module (part of a larger sound analysis suite) to I'd like to see that, or any quicker ways there might be of having the values of the device immediately, which I'm getting around to now, instead of having to wait for the wav file. read arbitrary wave files as long as they are integer-valued. The module reads the file into a numpy array with sufficient speed (a few seconds for a 3' piece), and separates the channels. I guess it could be easily adapted to your use case (currently the channels are averaged in the end; you would just have to leave that step out). I would give it to you as a public domain module. > > For the scipy.io.wavfile module, that should work too, although I don't know if it already separates the channels or not?I never used it so far. I changed it to import scipy.io.wavfile as wv for array_val in wv.read(r'c:\Users\david\test01.wav')[1]: print array_val[0],array_val[1] > > Friedrich > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user From smokefloat at gmail.com Fri Feb 24 06:28:13 2012 From: smokefloat at gmail.com (David Hutto) Date: Fri, 24 Feb 2012 06:28:13 -0500 Subject: [SciPy-User] parsing a wave file In-Reply-To: References: Message-ID: On Thu, Feb 23, 2012 at 8:26 AM, Warren Weckesser wrote: > > > On Thu, Feb 23, 2012 at 5:04 AM, David Hutto wrote: >> >> Hi, >> I'm using scypy 0.10.1RC2 amd 64 with python 2.7. >> I'm attempting to parse a wav file to access the data chunks to show >> the values for use >> in an oscilloscope(to know the intended usage, and maybe ?a better way >> to go about the solution). >> >> The following code: >> . >> >> #################### >> ? ? ? ? ? ? ? ?f = Sndfile(r'c:\Users\david\test01.wav', 'r') >> ? ? ? ? ? ? ? ?fs = f.samplerate >> ? ? ? ? ? ? ? ?nc = f.channels >> ? ? ? ? ? ? ? ?enc = f.encoding >> ? ? ? ? ? ? ? ?data = f.read_frames(1000) >> ? ? ? ? ? ? ? ?frame_amount = 1000 >> ? ? ? ? ? ? ? ?data_float = f.read_frames(frame_amount, dtype=np.float32) >> ? ? ? ? ? ? ? ?for i in range(0,frame_amount,1): >> ? ? ? ? ? ? ? ? ? ? ? ?print data_float[i] >> ? ? ? ?############## >> returns data_float[i] in the form: >> [-1,0.990988] >> [.08545,-0.009988] >> etc. >> >> My question is, is this the portion of data I'm parsing for(I'm almost >> positive it's not), or is there another data chunk? The graphing of >> the given data displays nothing close to the supplied current being >> recorded in the wav file. >> >> Any suggestions as to what I'm parsing in the wrong way, or better >> solutions than the above? >> >> > > > Can you use the? wavfile module in scipy.io? Yes, thanks for pointing that out. E.g. 
> >>>> from scipy.io import wavfile >>>> fs, data = wavfile.read(r'c:\Users\david\test01.wav') > > > Warren > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From d.witherick at ucl.ac.uk Fri Feb 24 06:48:46 2012 From: d.witherick at ucl.ac.uk (Dugan Witherick) Date: Fri, 24 Feb 2012 11:48:46 +0000 Subject: [SciPy-User] Scipy test failure when building on Scientific Linux 6.0 Message-ID: I'm trying to build numpy (1.6.1) and scipy (0.10.1rc2) on Scientific Linux 6.0. I've successfully managed to build both packages from source using python setup.py config_fc --fcompiler=gnu95 install but while numpy passes its tests, scipy doesn't: >>> scipy.test() Running unit tests for scipy NumPy version 1.6.1 NumPy is installed in /usr/lib64/python2.6/site-packages/numpy SciPy version 0.10.1rc2 SciPy is installed in /usr/lib64/python2.6/site-packages/scipy Python version 2.6.6 (r266:84292, May 20 2011, 16:42:11) [GCC 4.4.5 20110214 (Red Hat 4.4.5-6)] nose version 0.10.4 ---SKIPPED---- ====================================================================== ERROR: test_qhull.TestTriangulation.test_pathological ---------------------------------------------------------------------- Traceback (most recent call last): File "/usr/lib/python2.6/site-packages/nose/case.py", line 182, in runTest self.test(*self.arg) File "/usr/lib64/python2.6/site-packages/scipy/spatial/tests/test_qhull.py", line 216, in test_pathological assert_equal(tri.points[tri.vertices].max(), ValueError: zero-size array to maximum.reduce without identity ====================================================================== FAIL: test_interpnd.TestCloughTocher2DInterpolator.test_dense ---------------------------------------------------------------------- Traceback (most recent call last): File "/usr/lib/python2.6/site-packages/nose/case.py", line 182, in runTest self.test(*self.arg) File "/usr/lib64/python2.6/site-packages/scipy/interpolate/tests/test_interpnd.py", line 183, in test_dense err_msg="Function %d" % j) File "/usr/lib64/python2.6/site-packages/scipy/interpolate/tests/test_interpnd.py", line 132, in _check_accuracy assert_allclose(a, b, **kw) File "/usr/lib64/python2.6/site-packages/numpy/testing/utils.py", line 1168, in assert_allclose verbose=verbose, header=header) File "/usr/lib64/python2.6/site-packages/numpy/testing/utils.py", line 605, in assert_array_compare chk_same_position(x_id, y_id, hasval='nan') File "/usr/lib64/python2.6/site-packages/numpy/testing/utils.py", line 588, in chk_same_position raise AssertionError(msg) AssertionError: Not equal to tolerance rtol=0.01, atol=0.005 Function 0 x and y nan location mismatch: x: array([ nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan,... y: array([ 3.66796999e-02, 1.91605573e-01, 6.08362261e-01, 7.64324844e-02, 9.18031021e-01, 1.28033199e-01, 4.67121584e-01, 1.37085621e-01, 2.53092671e-01,... ---SKIPPED several other fails--- ---------------------------------------------------------------------- Ran 5102 tests in 80.529s FAILED (KNOWNFAIL=13, SKIP=35, errors=1, failures=19) numpy/scipy are being built against lapack (3.2.1), blas (3.2.1) and atlas (3.8.3) from the standard Scientific Linux repository. I would appreciate any advice/suggestions on where I might be going wrong. 
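In case it helps, this is roughly how I am checking which BLAS/LAPACK the builds actually picked up and re-running only the sub-packages that fail (just a sketch, nothing Scientific-Linux specific):

import numpy, scipy
import scipy.spatial, scipy.interpolate

numpy.show_config()      # BLAS/LAPACK/ATLAS found by the numpy build
scipy.show_config()      # same information for the scipy build

# re-run only the failing sub-packages with more verbose output
scipy.spatial.test(verbose=2)
scipy.interpolate.test(verbose=2)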
Thanks, Dugan -------------- next part -------------- An HTML attachment was scrubbed... URL: From friedrichromstedt at gmail.com Fri Feb 24 08:12:39 2012 From: friedrichromstedt at gmail.com (Friedrich Romstedt) Date: Fri, 24 Feb 2012 14:12:39 +0100 Subject: [SciPy-User] parsing a wave file In-Reply-To: References: Message-ID: Am 24. Februar 2012 12:26 schrieb David Hutto : > On Thu, Feb 23, 2012 at 9:08 AM, Friedrich Romstedt >> Sorry for the probably dumb question, but where does "Sndfile" originate from? > > An online example. OK >> A quick search in the Python 2.6.5 docs > > I'm using 2.7.2 > >> and in the online scipy docs yields nothing. > > It was a .dll to access functions from I believe. ok, so the question wasn't that dumb it seems :-) >> I've made positive experience with using the wave module from the standard library. I wrote a module (part of a larger sound analysis suite) to > > I'd like to see that, or any quicker ways there might be of having the > values of the device immediately, which I'm getting around to now, > instead of having to wait for the wav file. Alright, I'll send the file attached. Unchanged, as I don't know what you specifically will need. The inportant function is the WavfileModel.read() method. Adapt it to whatever you need. It is called a model because it is part of a model?view?controller framework. >> For the scipy.io.wavfile module, that should work too, although I don't know if it already separates the channels or not?I never used it so far. > > I changed it to > > import scipy.io.wavfile as wv > ? ? ? ?for array_val in wv.read(r'c:\Users\david\test01.wav')[1]: > ? ? ? ? ? ? ? ?print array_val[0],array_val[1] FWIW, was there any important outcome of that alteration? :-) I realise the docs on that file I'm sending attached are not quite up to date in some detail questions. You will see yourself. As I said, treat the file as public domain and don't let us bother ourselves with licensing questions. The file is of course by me. For immediate recording, I can name: pygame ? a Library for game programming, which can read many file formats, but is not easy to compile, and most notably you have access to the frames AFAIK only for a subset of the files (e.g. ogg vorbis IIRC). I would expect that it can also record directly, but I'm not sure. Check yourself: http://www.pygame.org/ pyaudio ? some "alpha" software for recording and playing back directly from and to the sound device, I used it once to playback some sounds. http://people.csail.mit.edu/hubert/pyaudio/ One warning: If you're dealing with Tkinter, make sure you don't call back into Tkinter methods from the playback or recording threaad, assumed you're using multithreading. At least on OS X I found the application become really unstable by this, meaning it might hang unexpected and unreproducibly. It's really nasty. Use some messanging system to some Tkinter polling thread. I can give more details if you need them. The "Tkinter" thread would be that one that imported Tkinter. Have fun! Friedrich -------------- next part -------------- A non-text attachment was scrubbed... 
Name: wavfile.py Type: application/octet-stream Size: 5114 bytes Desc: not available URL: From cournape at gmail.com Fri Feb 24 10:15:59 2012 From: cournape at gmail.com (David Cournapeau) Date: Fri, 24 Feb 2012 10:15:59 -0500 Subject: [SciPy-User] parsing a wave file In-Reply-To: References: Message-ID: On Thu, Feb 23, 2012 at 9:08 AM, Friedrich Romstedt wrote: > Am 23.02.2012 um 12:04 schrieb David Hutto : >> #################### >> ? ? ? ? ? ?f = Sndfile(r'c:\Users\david\test01.wav', 'r') > > Sorry for the probably dumb question, but where does "Sndfile" originate from? ?A quick search in the Python 2.6.5 docs and in the online scipy docs yields nothing. Most likely coming from the scikits.audiolab. >> ? ? ? ?fs = f.samplerate >> ? ? ? ?nc = f.channels >> ? ? ? ?enc = f.encoding >> ? ? ? ?data = f.read_frames(1000) >> ? ? ? ?frame_amount = 1000 >> ? ? ? ?data_float = f.read_frames(frame_amount, dtype=np.float32) >> ? ? ? ?for i in range(0,frame_amount,1): >> ? ? ? ? ? ?print data_float[i] >> ? ?############## >> returns data_float[i] in the form: >> [-1,0.990988] >> [.08545,-0.009988] > > It looks to me as if the "Sndfile" instance already decomposes into channels. ?In the wave file, the channels are intermingled, one frame per channel for each time instance (I'm not fully sure if "frame" denotes the full chunk for one time instance or only one datum for one channel, part of the chunk). A frame contains one time point, and each time point contains up to M values, where M is the number of channels. > >> Any suggestions as to what I'm parsing in the wrong way, or better >> solutions than the above? > > I could imagine that the Sndfile class you're using takes the sampling width (i.e., the number of bytes per sample) from the dtype you're handing over, interpreting the originally possible integer-valued frames as float32's. This could also alter the alignment, so that full garbage would result, possibly explaining the apparently highly variable output (as far as I can judge from the two chunks you provided). ?But notice, this is only guessing, speculation so to speak. That's not how it works. The dtype in read_frames only affect the dtype of the output, but will work independently of the type used in the wavfile. It goes through libsndfile to read the wav file, and libsndfile is known to be extremely reliable (used in many profesional audio softwares, open source and proprietary). > > I've made positive experience with using the wave module from the standard library. I wrote a module (part of a larger sound analysis suite) to read arbitrary wave files as long as they are integer-valued. The module reads the file into a numpy array with sufficient speed (a few seconds for a 3' piece), and separates the channels. I guess it could be easily adapted to your use case (currently the channels are averaged in the end; you would just have to leave that step out). I would give it to you as a public domain module. While scipy.io/wave modules are fine, using scikits.audiolab will give access to many different formats, even broken ones generated by some softwares. I would say the only siginficant issue with scikits.audiolab is the dependency on libsndfile, and some people may not like the LGPL licensing for both libsndfile and scikits.audiolab. 
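To make the frame layout concrete (one frame per time point, one value per channel), here is a rough sketch using the scipy.io.wavfile reader Warren mentioned; the file name is only a placeholder:

import numpy as np
from scipy.io import wavfile

fs, data = wavfile.read('test01.wav')   # for stereo, data.shape == (n_frames, 2)
if data.ndim == 1:                      # mono file: a single channel
    left = right = data
else:
    left, right = data[:, 0], data[:, 1]

t = np.arange(len(left)) / float(fs)    # time axis in seconds, e.g. for an oscilloscope plot

The same split works on the array returned by scikits.audiolab's read_frames.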
David From jeffalstott at gmail.com Thu Feb 23 21:15:33 2012 From: jeffalstott at gmail.com (Jeff Alstott) Date: Thu, 23 Feb 2012 21:15:33 -0500 Subject: [SciPy-User] fft not giving Hermitian output for real input Message-ID: I have a particular data set, d0, which is real: In [72]: d0 Out[72]: array([ 0.00907105, 0.0372916 , 0.01402867, ..., -0.04779497, -0.07054817, -0.0436582 ]) It doesn't produce Hermitian output from numpy.fft.fft. In [87]: y0 = fft.fft(d0) In [88]: y0 Out[88]: array([ 7.77156117e-14 +0.00000000e+00j, 6.89226454e-13 -1.56319402e-13j, 1.72140080e-13 +0.00000000e+00j, ..., -7.95807864e-13 -1.13686838e-13j, -3.41060513e-13 +1.13686838e-13j, 1.25055521e-12 -3.41060513e-13j]) In [89]: y0[1].conj()==y0[-1] Out[89]: False In [90]: y0[1].conj()==y0[-2] Out[90]: False Just to be double sure, I force d0 to be real. Same result. In [91]: y0 = fft.fft(real(d0)) In [92]: y0 Out[92]: array([ 7.77156117e-14 +0.00000000e+00j, 6.89226454e-13 -1.56319402e-13j, 1.72140080e-13 +0.00000000e+00j, ..., -7.95807864e-13 -1.13686838e-13j, -3.41060513e-13 +1.13686838e-13j, 1.25055521e-12 -3.41060513e-13j]) In [93]: y0[1].conj()==y0[-1] Out[93]: False In [94]: y0[1].conj()==y0[-2] Out[94]: False This behavior doesn't occur if I make a simple signal and fft it. In [63]: x = arange(10000) In [64]: z = sin(x) In [65]: q = fft.fft(z) In [66]: q Out[66]: array([ 1.93950541+0.j , 1.93950618+0.00020886j, 1.93950848+0.00041772j, ..., 1.93951232-0.00062658j, 1.93950848-0.00041772j, 1.93950618-0.00020886j]) Thoughts? -------------- next part -------------- An HTML attachment was scrubbed... URL: From skylar2 at u.washington.edu Fri Feb 24 15:11:47 2012 From: skylar2 at u.washington.edu (Skylar Thompson) Date: Fri, 24 Feb 2012 12:11:47 -0800 Subject: [SciPy-User] RHEL6 build issues Message-ID: <4F47EF03.9090905@u.washington.edu> Hi, I'm trying to build scipy for x86_64 RHEL 6.2. I'm running into problems with linking at the end. I keep getting errors like this: /net/gs/vol3/software/modules-sw-test/gcc/4.6.2/Linux/RHEL6/x86_64/bin/gfortran -Wall -L/net/gs/vol3/software/modules-sw-test/python/1.5.1/Linux/RHEL6/x86_64//lib/ -L/net/gs/vol3/software/modules-sw-test/ATLAS/3.9.63/Linux/RHEL6/x86_64//lib/ -L/net/gs/vol3/software/modules-sw-test/gcc/4.6.2/Linux/RHEL6/x86_64/lib64/ -L/net/gs/vol3/software/modules-sw-test/gcc/4.6.2/Linux/RHEL6/x86_64/lib/ -L/net/gs/vol3/software/modules-sw-test/gmp/5.0.2/Linux/RHEL6/x86_64//lib/ -L/net/gs/vol3/software/modules-sw-test/mpfr/3.1.0/Linux/RHEL6/x86_64//lib/ -L/net/gs/vol3/software/modules-sw-test/mpc/0.8.2/Linux/RHEL6/x86_64//lib/ -L/net/gs/vol3/software/modules-sw-test/python/2.7.2/Linux/RHEL6/x86_64//lib/ build/temp.linux-x86_64-2.7/build/src.linux-x86_64-2.7/scipy/fftpack/_fftpackmodule.o build/temp.linux-x86_64-2.7/scipy/fftpack/src/zfft.o build/temp.linux-x86_64-2.7/scipy/fftpack/src/drfft.o build/temp.linux-x86_64-2.7/scipy/fftpack/src/zrfft.o build/temp.linux-x86_64-2.7/scipy/fftpack/src/zfftnd.o build/temp.linux-x86_64-2.7/build/src.linux-x86_64-2.7/scipy/fftpack/src/dct.o build/temp.linux-x86_64-2.7/build/src.linux-x86_64-2.7/fortranobject.o -L. 
-Lbuild/temp.linux-x86_64-2.7 -ldfftpack -lfftpack -lpython2.7 -lgfortran -o build/lib.linux-x86_64-2.7/scipy/fftpack/_fftpack.so /usr/lib/../lib64/crt1.o: In function `_start': (.text+0x20): undefined reference to `main' collect2: ld returned 1 exit status /usr/lib/../lib64/crt1.o: In function `_start': (.text+0x20): undefined reference to `main' collect2: ld returned 1 exit status error: Command "/net/gs/vol3/software/modules-sw-test/gcc/4.6.2/Linux/RHEL6/x86_64/bin/gfortran -Wall -L/net/gs/vol3/software/modules-sw-test/python/1.5.1/Linux/RHEL6/x86_64//lib/ -L/net/gs/vol3/software/modules-sw-test/ATLAS/3.9.63/Linux/RHEL6/x86_64//lib/ -L/net/gs/vol3/software/modules-sw-test/gcc/4.6.2/Linux/RHEL6/x86_64/lib64/ -L/net/gs/vol3/software/modules-sw-test/gcc/4.6.2/Linux/RHEL6/x86_64/lib/ -L/net/gs/vol3/software/modules-sw-test/gmp/5.0.2/Linux/RHEL6/x86_64//lib/ -L/net/gs/vol3/software/modules-sw-test/mpfr/3.1.0/Linux/RHEL6/x86_64//lib/ -L/net/gs/vol3/software/modules-sw-test/mpc/0.8.2/Linux/RHEL6/x86_64//lib/ -L/net/gs/vol3/software/modules-sw-test/python/2.7.2/Linux/RHEL6/x86_64//lib/ build/temp.linux-x86_64-2.7/build/src.linux-x86_64-2.7/scipy/fftpack/_fftpackmodule.o build/temp.linux-x86_64-2.7/scipy/fftpack/src/zfft.o build/temp.linux-x86_64-2.7/scipy/fftpack/src/drfft.o build/temp.linux-x86_64-2.7/scipy/fftpack/src/zrfft.o build/temp.linux-x86_64-2.7/scipy/fftpack/src/zfftnd.o build/temp.linux-x86_64-2.7/build/src.linux-x86_64-2.7/scipy/fftpack/src/dct.o build/temp.linux-x86_64-2.7/build/src.linux-x86_64-2.7/fortranobject.o -L. -Lbuild/temp.linux-x86_64-2.7 -ldfftpack -lfftpack -lpython2.7 -lgfortran -o build/lib.linux-x86_64-2.7/scipy/fftpack/_fftpack.so" failed with exit status 1 I've built ATLAS, LAPACK, BLAS, and numpy with a custom-built gfortran 4.6.2, and I've made sure every build uses that particular gfortran. Unfortunately, I can't use the RHEL-provided gcc/gfortran because it segfaults when trying to build LAPACK. I've also tried using the Oracle Studio compiler suite, with similar errors. I've tried various combinations of versions for each component involved, without success.[1] Has anyone else seen this problem and solved it? Thanks in advance for any help! [1] ATLAS (3.8.4, 3.9.35, and 3.9.63), numpy (1.5.1 and 1.6.1), LAPACK (3.3.0 and 3.4.0), and scipy (0.7.2, 0.8.0, 0.9.0, 0.10.0, 0.10.1rc1 and rc2). -- -- Skylar Thompson (skylar2 at u.washington.edu) -- Genome Sciences Department, System Administrator -- Foege Building S046, (206)-685-7354 -- University of Washington School of Medicine From meier.benno at googlemail.com Sat Feb 25 09:53:15 2012 From: meier.benno at googlemail.com (Benno Meier) Date: Sat, 25 Feb 2012 15:53:15 +0100 Subject: [SciPy-User] Problem with complex bandpass filter Message-ID: <9CBF75F1-22E2-4E30-83CD-B7A4E7354B14@gmail.com> Dear all, I'm trying to implement a complex bandpass for IQ data in python using scipy.signal I first create a lowpass using irdesign (giving b and a) and then I transform the taps (b to b2). However, the filtered spectrum looks exactly the same as before. import scipy.signal.filter_design as ssfd from scipy.signal import lfilter [b, a] = ssfd.iirdesign(wp,ws,1,120) b2 = b*np.array(1j*2*np.pi*0.4*np.arange(len(b))) Can anyone help or point me to a routine that does the job? 
Thanks, Benno From aronne.merrelli at gmail.com Sat Feb 25 12:24:11 2012 From: aronne.merrelli at gmail.com (Aronne Merrelli) Date: Sat, 25 Feb 2012 11:24:11 -0600 Subject: [SciPy-User] fft not giving Hermitian output for real input In-Reply-To: References: Message-ID: On Thu, Feb 23, 2012 at 8:15 PM, Jeff Alstott wrote: > I have a particular data set, d0, which is real: > > In [72]: d0 > Out[72]: > array([ 0.00907105, 0.0372916 , 0.01402867, ..., -0.04779497, > -0.07054817, -0.0436582 ]) > > It doesn't produce Hermitian output from numpy.fft.fft. > > > In [87]: y0 = fft.fft(d0) > In [88]: y0 > Out[88]: > array([ 7.77156117e-14 +0.00000000e+00j, > 6.89226454e-13 -1.56319402e-13j, > 1.72140080e-13 +0.00000000e+00j, ..., > -7.95807864e-13 -1.13686838e-13j, > -3.41060513e-13 +1.13686838e-13j, 1.25055521e-12 > -3.41060513e-13j]) > In [89]: y0[1].conj()==y0[-1] > Out[89]: False > In [90]: y0[1].conj()==y0[-2] > Out[90]: False > > I might be missing something, but this seems like floating point rounding error to me. I don't know what your d0 really looks like, but with some random numbers, In [1]: d0 = np.random.normal(0,1,1024) In [2]: y0 = np.fft.fft(d0) In [3]: np.allclose( y0[1:].conj(), y0[-1:0:-1] ) Out[3]: True In [4]: np.max( (y0[1:].conj() - y0[-1:0:-1]).imag ) Out[4]: 4.3520742565306136e-13 In [5]: np.max( (y0[1:].conj() - y0[-1:0:-1]).real ) Out[5]: 2.957634137601417e-13 I tried this in MATLAB and the two halves are exactly equal. I would assume this means that the MATLAB FFT implementation does something extra to eliminate the rounding error for the special case of real input, whereas the NumPy FFT does not take that step. I'm pretty sure the NumPy FFT is just calling FFTPACK, so you'd need to check the implementation details there. Cheers, Aronne -------------- next part -------------- An HTML attachment was scrubbed... URL: From eptune at gmail.com Sat Feb 25 14:17:13 2012 From: eptune at gmail.com (Erik Petigura) Date: Sat, 25 Feb 2012 11:17:13 -0800 Subject: [SciPy-User] Alternatives to scipy.optimize Message-ID: <26CD320A-00E5-4BBA-B4BB-110E18E99D7B@gmail.com> Dear Scipy, Up until now, I've found the optimize module very useful. Now, I'm finding that I need finer control. I am fitting a model to data that is of the following from: model = func1(p1) + func2(p2) func1 is nonlinear in its parameters and func2 is linear in its parameters. There are two things I am struggling with: 1. I'd like to find the best fit parameters for func1 using an iterative approach (e.g. simplex algorithm that changes p1.). At each iteration, I want to compute the optimum p2 by linear least squares in the interest of speed and robustness. 2. I'd also like the ability to hold certain parameters fixed in the optimization with out redefining my objective function each time. Is there another module you would recommend? I've found openopt, but I wanted to get some guidance before I dive in to that. Erik -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Sat Feb 25 14:34:00 2012 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Sat, 25 Feb 2012 14:34:00 -0500 Subject: [SciPy-User] Alternatives to scipy.optimize In-Reply-To: <26CD320A-00E5-4BBA-B4BB-110E18E99D7B@gmail.com> References: <26CD320A-00E5-4BBA-B4BB-110E18E99D7B@gmail.com> Message-ID: On Sat, Feb 25, 2012 at 2:17 PM, Erik Petigura wrote: > Dear Scipy, > > Up until now, I've found the optimize module very useful. ?Now, I'm finding > that I need finer control. 
?I am fitting a model to data that is of the > following from: > > model = func1(p1) + func2(p2) > > func1 is nonlinear in its parameters and func2 is linear in its parameters. > > > There are two things I am struggling with: > > 1. I'd like to find the best fit parameters for func1 using an iterative > approach (e.g. simplex algorithm that changes p1.). ?At each iteration, I > want to compute the optimum p2 by linear least squares in the interest of > speed and robustness. you can still do this with any regular optimizer like optimize.fmin, just calculate the linear solution inside the outer function that is optimized by fmin. I haven't seen any python package yet, that would estimate partial linear models. If you find a solution, then I would be interested in it for statsmodels. Josef > > 2. I'd also like the ability to hold certain parameters fixed in the > optimization with out redefining my objective function each time. > > > Is there another module you would recommend? ?I've found openopt, but I > wanted to get some guidance before I dive in to that. > > Erik > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From cjordan1 at uw.edu Sat Feb 25 15:07:45 2012 From: cjordan1 at uw.edu (Christopher Jordan-Squire) Date: Sat, 25 Feb 2012 12:07:45 -0800 Subject: [SciPy-User] Alternatives to scipy.optimize In-Reply-To: <26CD320A-00E5-4BBA-B4BB-110E18E99D7B@gmail.com> References: <26CD320A-00E5-4BBA-B4BB-110E18E99D7B@gmail.com> Message-ID: On Sat, Feb 25, 2012 at 11:17 AM, Erik Petigura wrote: > Dear Scipy, > > Up until now, I've found the optimize module very useful. ?Now, I'm finding > that I need finer control. ?I am fitting a model to data that is of the > following from: > > model = func1(p1) + func2(p2) > > func1 is nonlinear in its parameters and func2 is linear in its parameters. > > > There are two things I am struggling with: > > 1. I'd like to find the best fit parameters for func1 using an iterative > approach (e.g. simplex algorithm that changes p1.). ?At each iteration, I > want to compute the optimum p2 by linear least squares in the interest of > speed and robustness. > Are p1 and p2 coupled somehow? They must be, or computing p2 at each iteration wouldn't be relevant. This seems like a modification that could be made to the source code without too much trouble, if iirc. Alternatively, you could possibly use the callback option in fmin. > 2. I'd also like the ability to hold certain parameters fixed in the > optimization with out redefining my objective function each time. > > Could you pass in a lambda function with those parameters fixed? -Chris JS > Is there another module you would recommend? ?I've found openopt, but I > wanted to get some guidance before I dive in to that. > > Erik > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From josef.pktd at gmail.com Sat Feb 25 18:33:29 2012 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Sat, 25 Feb 2012 18:33:29 -0500 Subject: [SciPy-User] Alternatives to scipy.optimize In-Reply-To: References: <26CD320A-00E5-4BBA-B4BB-110E18E99D7B@gmail.com> Message-ID: On Sat, Feb 25, 2012 at 3:07 PM, Christopher Jordan-Squire wrote: > On Sat, Feb 25, 2012 at 11:17 AM, Erik Petigura wrote: >> Dear Scipy, >> >> Up until now, I've found the optimize module very useful. ?Now, I'm finding >> that I need finer control. 
?I am fitting a model to data that is of the >> following from: >> >> model = func1(p1) + func2(p2) >> >> func1 is nonlinear in its parameters and func2 is linear in its parameters. >> >> >> There are two things I am struggling with: >> >> 1. I'd like to find the best fit parameters for func1 using an iterative >> approach (e.g. simplex algorithm that changes p1.). ?At each iteration, I >> want to compute the optimum p2 by linear least squares in the interest of >> speed and robustness. >> > > Are p1 and p2 coupled somehow? They must be, or computing p2 at each > iteration wouldn't be relevant. assuming he meant y = f(x1,p1) + x2*p2 + error and minimizing for example sum of squared error then the estimation of p1 and p2 cannot be separated. a quickly written draft of how I would do it, (which might be added to statsmodels after cleanup, adding results statistics and testing). It should be possible to subclass and overwrite the nonlinear function/method predict_nonlin https://gist.github.com/1911544 no fixed parameters yet Josef > > This seems like a modification that could be made to the source code > without too much trouble, if iirc. Alternatively, you could possibly > use the callback option in fmin. > >> 2. I'd also like the ability to hold certain parameters fixed in the >> optimization with out redefining my objective function each time. >> >> > > Could you pass in a lambda function with those parameters fixed? > > -Chris JS > >> Is there another module you would recommend? ?I've found openopt, but I >> wanted to get some guidance before I dive in to that. >> >> Erik >> >> >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user >> > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user From eptune at gmail.com Sat Feb 25 21:21:43 2012 From: eptune at gmail.com (Erik Petigura) Date: Sat, 25 Feb 2012 18:21:43 -0800 Subject: [SciPy-User] Alternatives to scipy.optimize Message-ID: Thanks for getting back to me! I'd like to minimize p1 and p2 together. Let me try to describe my problem a little better: I'm trying to fit an exoplanet transit light curve. My model is a box + a polynomial trend. https://gist.github.com/1912265 The polynomial coefficients and the depth of the box are linear parameters, so I want to fit them using linear least squares. The center and width of the transit are non-linear so I need to fit them with an iterative approach like optimize.fmin. Here's how I implemented it. https://gist.github.com/1912281 There is a lot unpacking and repacking the parameter array as it gets passed around between functions. One option that might work would be to define functions based on a "parameter object". This parameter object could have attributes like float/fix, linear/non-linear. I found a more object oriented optimization module here: http://newville.github.com/lmfit-py/ However, it doesn't allow for linear fitting. Erik -------------- next part -------------- An HTML attachment was scrubbed... 
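On the fix/float bookkeeping mentioned above, a minimal mask-based sketch of how a reduced vector of free parameters can be expanded back to the full parameter array inside a wrapper, so the objective is written only once; the parameter values and the placeholder objective are illustrative and not taken from the gists:

import numpy as np
from scipy.optimize import fmin

p_full = np.array([0.0, 0.1, 1.0, 0.0, 0.0])       # full parameter vector
free = np.array([True, True, False, True, True])   # hold the third entry fixed

def objective_full(p):
    # stand-in for the real misfit, written in terms of the full vector
    return np.sum((p - np.arange(len(p))) ** 2)

def objective_free(p_free):
    p = p_full.copy()
    p[free] = p_free               # expand the reduced vector to the full one
    return objective_full(p)

p_free_best = fmin(objective_free, p_full[free])
p_best = p_full.copy()
p_best[free] = p_free_best         # the fixed entry keeps its original value

This is essentially the mask pattern josef describes later in the thread for frozen distribution parameters.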
URL: From charlesr.harris at gmail.com Sat Feb 25 23:30:24 2012 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sat, 25 Feb 2012 21:30:24 -0700 Subject: [SciPy-User] Alternatives to scipy.optimize In-Reply-To: References: <26CD320A-00E5-4BBA-B4BB-110E18E99D7B@gmail.com> Message-ID: On Sat, Feb 25, 2012 at 12:34 PM, wrote: > On Sat, Feb 25, 2012 at 2:17 PM, Erik Petigura wrote: > > Dear Scipy, > > > > Up until now, I've found the optimize module very useful. Now, I'm > finding > > that I need finer control. I am fitting a model to data that is of the > > following from: > > > > model = func1(p1) + func2(p2) > > > > func1 is nonlinear in its parameters and func2 is linear in its > parameters. > > > > > > There are two things I am struggling with: > > > > 1. I'd like to find the best fit parameters for func1 using an iterative > > approach (e.g. simplex algorithm that changes p1.). At each iteration, I > > want to compute the optimum p2 by linear least squares in the interest of > > speed and robustness. > > you can still do this with any regular optimizer like optimize.fmin, > just calculate the linear solution inside the outer function that is > optimized by fmin. > That's what I do: use leastsq and let it vary the p1 parameters which are passed to a function that uses linear least squares to compute the residuals of the linear least squares problem func2(p2) = data - func1(p1). The residuals are the the values returned to leastsq. The func2 doesn't even have to be linear if the solution can be easily computed for subsets of the data. I've used this for fits involving hundreds of quaternions as the p2 and a far smaller number of p1. > > 2. I'd also like the ability to hold certain parameters fixed in the > > optimization with out redefining my objective function each time. > This is trickier, but leastsq will generally work if you just ignore some of the p1 parameters. Better would be to adjust the number of parameters passed to the inner function, but that is more complicated. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Sun Feb 26 00:21:05 2012 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Sun, 26 Feb 2012 00:21:05 -0500 Subject: [SciPy-User] Alternatives to scipy.optimize In-Reply-To: References: Message-ID: On Sat, Feb 25, 2012 at 9:21 PM, Erik Petigura wrote: > Thanks for getting back to me! > > I'd like to minimize p1 and p2 together. ?Let me try to describe my problem > a little better: > > I'm trying to fit an exoplanet transit light curve. ?My model is a box + a > polynomial trend. > > https://gist.github.com/1912265 > > The polynomial coefficients and the depth of the box are linear parameters, > so I want to fit them using linear least squares. ?The center and width of > the transit are non-linear so I need to fit them with an iterative approach > like optimize.fmin. ?Here's how I implemented it. > > https://gist.github.com/1912281 Took me a while to work my way through it, especially that you have a linear coefficient in front of the nonlinear part. The idea of calculating the nonlinear part by setting the linear coefficients to "neutral" values is nice. Is (((p-pNL0)/dpNL0)**2).sum() a penalization term? Since your objective function is quadratic optimize.leastsq might work better than fmin. The next numpy version will have a vander function to build Legendre polynomials. 
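A minimal sketch of the pattern suggested above, an outer nonlinear optimizer that varies only the transit center and width while the depth and trend coefficients are solved by linear least squares inside the residual function; the function names, polynomial degree and starting values are illustrative and not taken from the gists:

import numpy as np
from scipy.optimize import leastsq

def box(t, center, width):
    # unit-depth transit profile: 1 inside the transit, 0 outside
    return ((t > center - width / 2.0) & (t < center + width / 2.0)).astype(float)

def residuals(p_nonlin, t, y, poly_deg=2):
    center, width = p_nonlin
    # design matrix: box profile plus polynomial trend columns
    A = np.column_stack([box(t, center, width)] +
                        [t ** k for k in range(poly_deg + 1)])
    p_lin = np.linalg.lstsq(A, y)[0]   # optimal linear coefficients for this (center, width)
    return y - A.dot(p_lin)

# t, y = observed times and fluxes
# p_best, ier = leastsq(residuals, (0.0, 0.1), args=(t, y))

leastsq then only searches over the two nonlinear parameters, and the linear part is re-solved exactly at every iteration.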
(Or maybe it already has, I'm on 1.5) The next thing will be to get the covariance matrix for the parameter estimates :) > > There is a lot unpacking and repacking the parameter array as it gets passed > around between functions. ?One option that might work would be to define > functions based on a "parameter object". ?This parameter object could have > attributes like float/fix, linear/non-linear. ?I found a more object > oriented optimization module here: > > http://newville.github.com/lmfit-py/ The easiest is to just write some helper functions to stack or unstack the parameters, or set some to fixed. In statsmodels we use this in some cases (as methods since our models are classes), also to transform parameters. Since often this affects groups of parameters, I don't know if the lmfit approach would helps in this case. (Personally, I like numpy arrays with masks or fancy indexing, which is easy to understand. Ast manipulation scares me.) Josef > > However, it doesn't allow for linear fitting. > > Erik > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From markbak at gmail.com Sun Feb 26 05:49:45 2012 From: markbak at gmail.com (Mark Bakker) Date: Sun, 26 Feb 2012 11:49:45 +0100 Subject: [SciPy-User] Does scipy binary install libgfortran.dylib? Message-ID: Hello List, Does Scipy install the correct version of libgfortran.dylib? Does it simply put it in /usr/local/lib/ ? I am trying to distribute my own package which includes FORTRAN extensions and when installing on a brand new machine it complains that libgfortran.3.dylib cannot be found. I was wondering how Scipy handles this (and thanks to Ralf Gommers for helping me so far, but I haven't been able to solve this). Thanks, Mark -------------- next part -------------- An HTML attachment was scrubbed... URL: From gregor.thalhammer at gmail.com Sun Feb 26 06:15:52 2012 From: gregor.thalhammer at gmail.com (Gregor Thalhammer) Date: Sun, 26 Feb 2012 12:15:52 +0100 Subject: [SciPy-User] Alternatives to scipy.optimize In-Reply-To: References: Message-ID: Am 26.2.2012 um 03:21 schrieb Erik Petigura: > Thanks for getting back to me! > > I'd like to minimize p1 and p2 together. Let me try to describe my problem a little better: > > I'm trying to fit an exoplanet transit light curve. My model is a box + a polynomial trend. > > https://gist.github.com/1912265 > > The polynomial coefficients and the depth of the box are linear parameters, so I want to fit them using linear least squares. The center and width of the transit are non-linear so I need to fit them with an iterative approach like optimize.fmin. Here's how I implemented it. > > https://gist.github.com/1912281 > I didn't look in detail at your code, but it seems to me the approach described e.g. in Separable NonLinear Least Squares would be a good choice for you, especially since you are able to analytically calculate the derivatives. The method is similar to the approach you chose, it first solves a linear least squares problem to determine estimates for the linear parameters. This information is then used to calculate the derivatives (Jacobian) with respect to the nonlinear parameters for an iterative minimization (Levenberq-Marquardt). I have a python implementation, if you are interested, I can share the code - but it's poorly documented. Gregor -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From epetigura at berkeley.edu Sat Feb 25 14:16:28 2012 From: epetigura at berkeley.edu (Erik Petigura) Date: Sat, 25 Feb 2012 11:16:28 -0800 Subject: [SciPy-User] Alternatives to scipy.optimize Message-ID: Dear Scipy, Up until now, I've found the optimize module very useful. Now, I'm finding that I need finer control. I am fitting a model to data that is of the following from: model = func1(p1) + func2(p2) func1 is nonlinear in its parameters and func2 is linear in its parameters. There are two things I am struggling with: 1. I'd like to find the best fit parameters for func1 using an iterative approach (e.g. simplex algorithm that changes p1.). At each iteration, I want to compute the optimum p2 by linear least squares in the interest of speed and robustness. 2. I'd also like the ability to hold certain parameters fixed in the optimization with out redefining my objective function each time. Is there another module you would recommend? I've found openopt, but I wanted to get some guidance before I dive in to that. Erik -------------- next part -------------- An HTML attachment was scrubbed... URL: From kevin.gullikson at gmail.com Sat Feb 25 14:31:55 2012 From: kevin.gullikson at gmail.com (Kevin Gullikson) Date: Sat, 25 Feb 2012 13:31:55 -0600 Subject: [SciPy-User] Alternatives to scipy.optimize In-Reply-To: <26CD320A-00E5-4BBA-B4BB-110E18E99D7B@gmail.com> References: <26CD320A-00E5-4BBA-B4BB-110E18E99D7B@gmail.com> Message-ID: Erik, You can do the least-squares fit to func2 within the function that you pass to scipy.optimize.leastsq (or similar). For fixed parameters, I use a second array called const_pars, and pass it to leastsq as one of the arguments (e.g. leastsq(ErrFunc, pars, args=(const_pars)) ) For example: def func1(p1): //some non-linear function of p1 def func2(p2): //some linear function of p2 def ErrFunc(p1,p2): //Do linear fit to func2 and optimize parameters p1 return func1(p1) + func2(p2) //Run leastsq: pars, success = scipy.optimize.leastsq(ErrFunc, p1, args=(p2)) Hope that helps! Kevin Gullikson On Sat, Feb 25, 2012 at 1:17 PM, Erik Petigura wrote: > Dear Scipy, > > Up until now, I've found the optimize module very useful. Now, I'm > finding that I need finer control. I am fitting a model to data that is of > the following from: > > model = func1(p1) + func2(p2) > > func1 is nonlinear in its parameters and func2 is linear in its > parameters. > > There are two things I am struggling with: > > 1. I'd like to find the best fit parameters for func1 using an iterative > approach (e.g. simplex algorithm that changes p1.). At each iteration, I > want to compute the optimum p2 by linear least squares in the interest of > speed and robustness. > > 2. I'd also like the ability to hold certain parameters fixed in the > optimization with out redefining my objective function each time. > > > Is there another module you would recommend? I've found openopt, but I > wanted to get some guidance before I dive in to that. > > Erik > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mobilebackup77 at gmail.com Sat Feb 25 16:35:46 2012 From: mobilebackup77 at gmail.com (Me Myself) Date: Sat, 25 Feb 2012 16:35:46 -0500 Subject: [SciPy-User] ndimage.sobel multiple passes? Message-ID: In my code base, currently I need to compute sobel on a 3d dataset. 
I do this using: # ndimage.sobel dx = sobel(vdata, 0) dy = sobel(vdata, 1) dz = sobel(vdata, 2) any ideas how this can be done using one pass instead of doing this in 3 passes. My dataset is large and it would be nice to speed this up. Thanks, --R -------------- next part -------------- An HTML attachment was scrubbed... URL: From guziy.sasha at gmail.com Sun Feb 26 10:38:56 2012 From: guziy.sasha at gmail.com (Oleksandr Huziy) Date: Sun, 26 Feb 2012 10:38:56 -0500 Subject: [SciPy-User] ndimage.sobel multiple passes? In-Reply-To: References: Message-ID: Hi, you could try multiprocessing, if you have more than one core You could speed it up 3 times at least. -- Oleksandr Huziy 2012/2/25 Me Myself : > > In my code base, currently I need to compute sobel on a 3d dataset. I do > this using: > > # ndimage.sobel > ??? dx = sobel(vdata, 0) > ??? dy = sobel(vdata, 1) > ??? dz = sobel(vdata, 2) > > any ideas how this can be done using one pass instead of doing this in 3 > passes. My dataset is large and it would be nice to speed this up. > > Thanks, > --R > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From guziy.sasha at gmail.com Sun Feb 26 10:41:50 2012 From: guziy.sasha at gmail.com (Oleksandr Huziy) Date: Sun, 26 Feb 2012 10:41:50 -0500 Subject: [SciPy-User] ndimage.sobel multiple passes? In-Reply-To: References: Message-ID: Rughly I would do the following def worker(arg): index, data = arg return sobel(vdata, index) p = Pool() data_list = 3 * [data] result = p.map(worker, zip(xrange(3), data_list)) Cheers -- Oleksandr Huziy 2012/2/26 Oleksandr Huziy : > Hi, > > you could try multiprocessing, if you have more than one core > > You could speed it up 3 times at least. > -- > Oleksandr Huziy > > 2012/2/25 Me Myself : >> >> In my code base, currently I need to compute sobel on a 3d dataset. I do >> this using: >> >> # ndimage.sobel >> ??? dx = sobel(vdata, 0) >> ??? dy = sobel(vdata, 1) >> ??? dz = sobel(vdata, 2) >> >> any ideas how this can be done using one pass instead of doing this in 3 >> passes. My dataset is large and it would be nice to speed this up. >> >> Thanks, >> --R >> >> >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user >> From ralf.gommers at googlemail.com Sun Feb 26 11:38:19 2012 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Sun, 26 Feb 2012 17:38:19 +0100 Subject: [SciPy-User] Does scipy binary install libgfortran.dylib? In-Reply-To: References: Message-ID: On Sun, Feb 26, 2012 at 11:49 AM, Mark Bakker wrote: > Hello List, > > Does Scipy install the correct version of libgfortran.dylib? Does it > simply put it in /usr/local/lib/ ? > > I am trying to distribute my own package which includes FORTRAN extensions > and when installing on a brand new machine it complains that > libgfortran.3.dylib cannot be found. I was wondering how Scipy handles this > (and thanks to Ralf Gommers for helping me so far, but I haven't been able > to solve this). > Sorry for not being clearer before - I only knew that this worked for scipy, not exactly how it worked. After some digging I found this in the 0.7.1 release notes: "Mac OS X binary installer is now a proper universal build, and does not depend on gfortran anymore (libgfortran is statically linked)." Ralf -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From ralf.gommers at googlemail.com Sun Feb 26 16:43:27 2012 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Sun, 26 Feb 2012 22:43:27 +0100 Subject: [SciPy-User] Scipy test failure when building on Scientific Linux 6.0 In-Reply-To: References: Message-ID: On Fri, Feb 24, 2012 at 12:48 PM, Dugan Witherick wrote: > I'm trying to build numpy (1.6.1) and scipy (0.10.1rc2) on Scientific > Linux 6.0. I've successfully managed to build both packages from source > using > > python setup.py config_fc --fcompiler=gnu95 install > > but while numpy passes its tests, scipy doesn't: > > >>> scipy.test() > Running unit tests for scipy > NumPy version 1.6.1 > NumPy is installed in /usr/lib64/python2.6/site-packages/numpy > SciPy version 0.10.1rc2 > SciPy is installed in /usr/lib64/python2.6/site-packages/scipy > Python version 2.6.6 (r266:84292, May 20 2011, 16:42:11) [GCC 4.4.5 > 20110214 (Red Hat 4.4.5-6)] > nose version 0.10.4 > > ---SKIPPED---- > > ====================================================================== > ERROR: test_qhull.TestTriangulation.test_pathological > ---------------------------------------------------------------------- > Traceback (most recent call last): > File "/usr/lib/python2.6/site-packages/nose/case.py", line 182, in > runTest > self.test(*self.arg) > File > "/usr/lib64/python2.6/site-packages/scipy/spatial/tests/test_qhull.py", > line 216, in test_pathological > assert_equal(tri.points[tri.vertices].max(), > ValueError: zero-size array to maximum.reduce without identity > > ====================================================================== > FAIL: test_interpnd.TestCloughTocher2DInterpolator.test_dense > ---------------------------------------------------------------------- > Traceback (most recent call last): > File "/usr/lib/python2.6/site-packages/nose/case.py", line 182, in > runTest > self.test(*self.arg) > File > "/usr/lib64/python2.6/site-packages/scipy/interpolate/tests/test_interpnd.py", > line 183, in test_dense > err_msg="Function %d" % j) > File > "/usr/lib64/python2.6/site-packages/scipy/interpolate/tests/test_interpnd.py", > line 132, in _check_accuracy > assert_allclose(a, b, **kw) > File "/usr/lib64/python2.6/site-packages/numpy/testing/utils.py", line > 1168, in assert_allclose > verbose=verbose, header=header) > File "/usr/lib64/python2.6/site-packages/numpy/testing/utils.py", line > 605, in assert_array_compare > chk_same_position(x_id, y_id, hasval='nan') > File "/usr/lib64/python2.6/site-packages/numpy/testing/utils.py", line > 588, in chk_same_position > raise AssertionError(msg) > AssertionError: > Not equal to tolerance rtol=0.01, atol=0.005 > Function 0 > x and y nan location mismatch: > x: array([ nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, > nan, > nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, > nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan,... > y: array([ 3.66796999e-02, 1.91605573e-01, 6.08362261e-01, > 7.64324844e-02, 9.18031021e-01, 1.28033199e-01, > 4.67121584e-01, 1.37085621e-01, 2.53092671e-01,... > > ---SKIPPED several other fails--- > > ---------------------------------------------------------------------- > Ran 5102 tests in 80.529s > > FAILED (KNOWNFAIL=13, SKIP=35, errors=1, failures=19) > > > numpy/scipy are being built against lapack (3.2.1), blas (3.2.1) and atlas > (3.8.3) from the standard Scientific Linux repository. I would appreciate > any advice/suggestions on where I might be going wrong. > Could you post all the test failures and the build log? 
Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From markbak at gmail.com Mon Feb 27 07:28:30 2012 From: markbak at gmail.com (Mark Bakker) Date: Mon, 27 Feb 2012 13:28:30 +0100 Subject: [SciPy-User] Does scipy binary install libgfortran.dylib? Message-ID: Excellent, so it can be done. The next question is: How? I found a suggestion from Brian Toby on the web from this summer, but that didn't work form me. This is what I did: On my Mac Terminal I type: LDFLAGS='-undefined dynamic_lookup -bundle -static-libgfortran -static-libgcc' f2py -c -m besselaes besselaes.f95 This nicely creates the extension, which I can run on the machine I created it on, but if I move it to a machine that doesn't have the libgfortran.3.dylib file, it doesn't run and complains about that, so I conclude that the dynamic link failed (the size of the extension doesn't change when I set the LDFLAGS, which I thought was a bad omen). Any thoughts? Am I doing something wrong? Thanks for any help, Mark Date: Sun, 26 Feb 2012 17:38:19 +0100 > From: Ralf Gommers > Subject: Re: [SciPy-User] Does scipy binary install libgfortran.dylib? > To: SciPy Users List > Message-ID: > > > Content-Type: text/plain; charset="iso-8859-1" > > On Sun, Feb 26, 2012 at 11:49 AM, Mark Bakker wrote: > > > Hello List, > > > > Does Scipy install the correct version of libgfortran.dylib? Does it > > simply put it in /usr/local/lib/ ? > > > > I am trying to distribute my own package which includes FORTRAN > extensions > > and when installing on a brand new machine it complains that > > libgfortran.3.dylib cannot be found. I was wondering how Scipy handles > this > > (and thanks to Ralf Gommers for helping me so far, but I haven't been > able > > to solve this). > > > > Sorry for not being clearer before - I only knew that this worked for > scipy, not exactly how it worked. After some digging I found this in the > 0.7.1 release notes: "Mac OS X binary installer is now a proper universal > build, and does not depend on gfortran anymore (libgfortran is statically > linked)." > > Ralf > -------------- next part -------------- An HTML attachment was scrubbed... URL: From matt.newville at gmail.com Sun Feb 26 10:14:37 2012 From: matt.newville at gmail.com (Matthew Newville) Date: Sun, 26 Feb 2012 07:14:37 -0800 (PST) Subject: [SciPy-User] Alternatives to scipy.optimize In-Reply-To: References: Message-ID: <11398732.11.1330269277150.JavaMail.geo-discussion-forums@ynca15> Hi Erik, Josef, On Saturday, February 25, 2012 8:21:43 PM UTC-6, Erik Petigura wrote: > > Thanks for getting back to me! > I'd like to minimize p1 and p2 together. Let me try to describe my problem a little better: > > I'm trying to fit an exoplanet transit light curve. My model is a box + a polynomial trend. > > https://gist.github.com/1912265 > > The polynomial coefficients and the depth of the box are linear parameters, so I want to > fit them using linear least squares. The center and width of the transit are non-linear > so I need to fit them with an iterative approach like optimize.fmin. > Here's how I implemented it. > > https://gist.github.com/1912281 I'm not sure I fully follow your model, but if I understand correctly, you're looking to find optimal parameters for something like model = linear_function(p1) + nonlinear_function(p2) for sets of coefficients p1 and p2, each set having a few fitting variables, some of which may be related. Is there an instability that prevents you from just treating this as a single non-linear model? 
Another option might be to have the residual function for scipy.optimize.leastsq (or lmfit) call numpy.linalg.lstsq at each iteration. I would think that more fully explore the parameter space than first fitting nonlinear_function with scipy.optimize.fmin() then passing those best-fit parameters to numpy.linalg.lstsq(), but perhaps I'm not fully understanding the nature of the problem. > There is a lot unpacking and repacking the parameter array as it gets passed around > between functions. One option that might work would be to define functions based on a > "parameter object". This parameter object could have attributes like float/fix, > linear/non-linear. I found a more object oriented optimization module here: > http://newville.github.com/lmfit-py/ > > However, it doesn't allow for linear fitting. Linear fitting could probably be added to lmfit, though I haven't looked into it. For this problem, I would pursue the idea of treating your fitting problem as a single model for non-linear least squares with optimize.leastsq or with lmfit. Perhaps I missing something about your model that makes this approach unusually challenging. Josef P wrote: > The easiest is to just write some helper functions to stack or unstack > the parameters, or set some to fixed. In statsmodels we use this in > some cases (as methods since our models are classes), also to > transform parameters. > Since often this affects groups of parameters, I don't know if the > lmfit approach would helps in this case. If many people who are writing their own model functions find themselves writing similar helper functions to stack and unstack parameters, "the easiest" here might not be "the best", and providing tools to do this stacking and unstacking might be worthwhile. Lmfit tries to do this. > (Personally, I like numpy arrays with masks or fancy indexing, which > is easy to understand. Ast manipulation scares me.) I don't understand how masks or fancy indexing would help here. How would that work? FWIW, lmfit uses python's ast module only for algebraic constraints between parameters. That is, from lmfit import Parameter Parameter(name='a', value=10, vary=True) Parameter(name='b', expr='sqrt(a) + 1') will compile 'sqrt(a)+1' into its AST representation and evaluate that for the value of 'b' when needed. So lmfit doesn't so much manipulate the AST as interpret it. What is manipulated is the namespace, so that 'a' is interpreted as "look up the current value of Parameter 'a'" when the AST is evaluated. Again, this applies only for algebraic constraints on parameters. Having written fitting programs that support user-supplied algebraic constraints between parameters in Fortran77, I find interpreting python's AST to be remarkably simple and robust. I'm scared much more by statistical modeling of economic data ;) Cheers, --Matt Newville -------------- next part -------------- An HTML attachment was scrubbed... URL: From d.witherick at ucl.ac.uk Mon Feb 27 09:47:19 2012 From: d.witherick at ucl.ac.uk (Dugan Witherick) Date: Mon, 27 Feb 2012 14:47:19 +0000 Subject: [SciPy-User] Scipy test failure when building on Scientific Linux 6.0 In-Reply-To: References: Message-ID: Dear Ralf, I've attached the build log and test log to this message. I'm guessing that mail board will scrub the attachments and replace them with links but I thought it better to do it this way than fill people's inboxes with long messages. Please say if you think it is better to just inline the logs. 
Dugan On 26 February 2012 21:43, Ralf Gommers wrote: > > > On Fri, Feb 24, 2012 at 12:48 PM, Dugan Witherick wrote: > >> I'm trying to build numpy (1.6.1) and scipy (0.10.1rc2) on Scientific >> Linux 6.0. I've successfully managed to build both packages from source >> using >> >> python setup.py config_fc --fcompiler=gnu95 install >> >> but while numpy passes its tests, scipy doesn't: >> >> >>> scipy.test() >> Running unit tests for scipy >> NumPy version 1.6.1 >> NumPy is installed in /usr/lib64/python2.6/site-packages/numpy >> SciPy version 0.10.1rc2 >> SciPy is installed in /usr/lib64/python2.6/site-packages/scipy >> Python version 2.6.6 (r266:84292, May 20 2011, 16:42:11) [GCC 4.4.5 >> 20110214 (Red Hat 4.4.5-6)] >> nose version 0.10.4 >> >> ---SKIPPED---- >> >> ====================================================================== >> ERROR: test_qhull.TestTriangulation.test_pathological >> ---------------------------------------------------------------------- >> Traceback (most recent call last): >> File "/usr/lib/python2.6/site-packages/nose/case.py", line 182, in >> runTest >> self.test(*self.arg) >> File >> "/usr/lib64/python2.6/site-packages/scipy/spatial/tests/test_qhull.py", >> line 216, in test_pathological >> assert_equal(tri.points[tri.vertices].max(), >> ValueError: zero-size array to maximum.reduce without identity >> >> ====================================================================== >> FAIL: test_interpnd.TestCloughTocher2DInterpolator.test_dense >> ---------------------------------------------------------------------- >> Traceback (most recent call last): >> File "/usr/lib/python2.6/site-packages/nose/case.py", line 182, in >> runTest >> self.test(*self.arg) >> File >> "/usr/lib64/python2.6/site-packages/scipy/interpolate/tests/test_interpnd.py", >> line 183, in test_dense >> err_msg="Function %d" % j) >> File >> "/usr/lib64/python2.6/site-packages/scipy/interpolate/tests/test_interpnd.py", >> line 132, in _check_accuracy >> assert_allclose(a, b, **kw) >> File "/usr/lib64/python2.6/site-packages/numpy/testing/utils.py", line >> 1168, in assert_allclose >> verbose=verbose, header=header) >> File "/usr/lib64/python2.6/site-packages/numpy/testing/utils.py", line >> 605, in assert_array_compare >> chk_same_position(x_id, y_id, hasval='nan') >> File "/usr/lib64/python2.6/site-packages/numpy/testing/utils.py", line >> 588, in chk_same_position >> raise AssertionError(msg) >> AssertionError: >> Not equal to tolerance rtol=0.01, atol=0.005 >> Function 0 >> x and y nan location mismatch: >> x: array([ nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, >> nan, >> nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, >> nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, >> nan,... >> y: array([ 3.66796999e-02, 1.91605573e-01, 6.08362261e-01, >> 7.64324844e-02, 9.18031021e-01, 1.28033199e-01, >> 4.67121584e-01, 1.37085621e-01, 2.53092671e-01,... >> >> ---SKIPPED several other fails--- >> >> ---------------------------------------------------------------------- >> Ran 5102 tests in 80.529s >> >> FAILED (KNOWNFAIL=13, SKIP=35, errors=1, failures=19) >> >> >> numpy/scipy are being built against lapack (3.2.1), blas (3.2.1) and >> atlas (3.8.3) from the standard Scientific Linux repository. I would >> appreciate any advice/suggestions on where I might be going wrong. >> > > Could you post all the test failures and the build log? 
> > Ralf > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: build.log Type: application/octet-stream Size: 533543 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: test.log Type: application/octet-stream Size: 31799 bytes Desc: not available URL: From josef.pktd at gmail.com Mon Feb 27 10:56:30 2012 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Mon, 27 Feb 2012 10:56:30 -0500 Subject: [SciPy-User] Alternatives to scipy.optimize In-Reply-To: <11398732.11.1330269277150.JavaMail.geo-discussion-forums@ynca15> References: <11398732.11.1330269277150.JavaMail.geo-discussion-forums@ynca15> Message-ID: On Sun, Feb 26, 2012 at 10:14 AM, Matthew Newville wrote: > > Hi Erik, Josef, > > > On Saturday, February 25, 2012 8:21:43 PM UTC-6, Erik Petigura wrote: >> >> Thanks for getting back to me! >> I'd like to minimize p1 and p2 together.? Let me try to describe my >> problem a little better: >> >> I'm trying to fit an exoplanet transit light curve.? My model is a box + a >> polynomial trend. >> >> https://gist.github.com/1912265 >> >> The polynomial coefficients and the depth of the box are linear >> parameters, so I want to >> fit them using linear least squares.? The center and width of the transit >> are non-linear >> so I need to fit them with an iterative approach like optimize.fmin. >> Here's how I implemented it. >> >> https://gist.github.com/1912281 > > I'm not sure I fully follow your model, but if I understand correctly, > you're looking to find optimal parameters for something like > ? model = linear_function(p1) + nonlinear_function(p2) yes, I've read about this mostly in the context of semiparametric versions when the nonlinear function does not have a parametric form. http://en.wikipedia.org/wiki/Semiparametric_regression#Partially_linear_models > > for sets of coefficients p1 and p2, each set having a few fitting variables, > some of which may be related.? Is there an instability that prevents you > from just treating this as a single non-linear model? I think p1 and p2 shouldn't have any cross restrictions or it will get a bit more complicated. The main reason for splitting it up is computational, I think. It's quite common in econometrics to "concentrate out" parameters that have an explicit solution, so we need the nonlinear optimization only for a smaller parameter space. > > Another option might be to have the residual function for > scipy.optimize.leastsq (or lmfit) call numpy.linalg.lstsq at each > iteration.? I would think that more fully explore the parameter space than > first fitting nonlinear_function with scipy.optimize.fmin() then passing > those best-fit parameters to numpy.linalg.lstsq(), but perhaps I'm not fully > understanding the nature of the problem. That's what both of us did, my version https://gist.github.com/1911544 > > >> There is a lot unpacking and repacking the parameter array as it gets >> passed around >> between functions.? One option that might work would be to define >> functions based on a >> "parameter object".? This parameter object could have attributes like >> float/fix, >> linear/non-linear.? 
I found a more object oriented optimization module >> here: >> http://newville.github.com/lmfit-py/ >> >> However, it doesn't allow for linear fitting. > > Linear fitting could probably be added to lmfit, though I haven't looked > into it.?? For this problem, I would pursue the idea of treating your > fitting problem as a single model for non-linear least squares with > optimize.leastsq or with lmfit.?? Perhaps I missing something about your > model that makes this approach unusually challenging. > > > > Josef P wrote: > >> The easiest is to just write some helper functions to stack or unstack >> the parameters, or set some to fixed. In statsmodels we use this in >> some cases (as methods since our models are classes), also to >> transform parameters. >> Since often this affects groups of parameters, I don't know if the >> lmfit approach would helps in this case. > > If many people who are writing their own model functions find themselves > writing similar helper functions to stack and unstack parameters, "the > easiest" here might not be "the best", and providing tools to do this > stacking and unstacking might be worthwhile.?? Lmfit tries to do this. > >> (Personally, I like numpy arrays with masks or fancy indexing, which >> is easy to understand. Ast manipulation scares me.) > > I don't understand how masks or fancy indexing would help here. How would > that work? ---- a few examples how we currently handle parameter restrictions in statsmodels Skippers example: transform parameters in the loglikelihood to force the parameter estimates of an ARMA process to produce a stationary (stable) solution - transforms groups of parameters at once https://github.com/statsmodels/statsmodels/blob/master/scikits/statsmodels/tsa/arima_model.py#L230 https://github.com/statsmodels/statsmodels/blob/master/scikits/statsmodels/tsa/arima_model.py#L436 not a clean example: fitting distributions with some frozen parameters https://github.com/statsmodels/statsmodels/blob/master/scikits/statsmodels/sandbox/distributions/sppatch.py#L267 select parameters that are not frozen to use with fmin x0 = np.array(x0)[np.isnan(frmask)] expand the parameters again to include the frozen parameters inside the loglikelihood function theta = frmask.copy() theta[np.isnan(frmask)] = thetash It's not as user friendly as the version that got into scipy.stats.distribution, (but it's developer friendly because I don't have to stare at it for hours to spot a bug) structural Vector Autoregression uses a similar mask pattern https://github.com/statsmodels/statsmodels/blob/master/scikits/statsmodels/tsa/vector_ar/svar_model.py#L89 In another sandbox model I build a nested dictionary to map the model parameters to the reduced list that can be fed to scipy.optimize.fmin_xxx But we don't have user defined nonlinear constraints yet. --------- > > FWIW, lmfit uses python's ast module only for algebraic constraints between > parameters.? That is, > ??? from lmfit import Parameter > ??? Parameter(name='a', value=10, vary=True) > ??? Parameter(name='b', expr='sqrt(a) + 1') > > will compile 'sqrt(a)+1' into its AST representation and evaluate that for > the value of 'b' when needed.? So lmfit doesn't so much manipulate the AST > as interpret it.? What is? manipulated is the namespace, so that 'a' is > interpreted as "look up the current value of Parameter 'a'" when the AST is > evaluated.?? Again, this applies only for algebraic constraints on > parameters. 
It's a bit similar to a formula framework for specifying a statistical model that is to be estimated (with lengthy discussion on the statsmodels list). I see the advantages but I haven't spent the weeks of time to figure out what's behind the machinery that is required (especially given all the other statistics and econometrics that is missing, and where I only have to worry about how numpy, scipy and statsmodels behaves. And I like Zen of Python #2) > > Having written fitting programs that support user-supplied algebraic > constraints between parameters in Fortran77, I find interpreting python's > AST to be remarkably simple and robust.? I'm scared much more by statistical > modeling of economic data ;) different tastes and applications. I'd rather think about the next 20 statistical tests I want to code, than about the AST or how sympy translates into numpy. Cheers, Josef > > Cheers, > > --Matt Newville > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From njs at pobox.com Mon Feb 27 11:33:00 2012 From: njs at pobox.com (Nathaniel Smith) Date: Mon, 27 Feb 2012 16:33:00 +0000 Subject: [SciPy-User] Alternatives to scipy.optimize In-Reply-To: <11398732.11.1330269277150.JavaMail.geo-discussion-forums@ynca15> References: <11398732.11.1330269277150.JavaMail.geo-discussion-forums@ynca15> Message-ID: On Sun, Feb 26, 2012 at 3:14 PM, Matthew Newville wrote: > ??? from lmfit import Parameter > ??? Parameter(name='a', value=10, vary=True) > ??? Parameter(name='b', expr='sqrt(a) + 1') > > will compile 'sqrt(a)+1' into its AST representation and evaluate that for > the value of 'b' when needed.? So lmfit doesn't so much manipulate the AST > as interpret it.? What is? manipulated is the namespace, so that 'a' is > interpreted as "look up the current value of Parameter 'a'" when the AST is > evaluated.?? Again, this applies only for algebraic constraints on > parameters. > > Having written fitting programs that support user-supplied algebraic > constraints between parameters in Fortran77, I find interpreting python's > AST to be remarkably simple and robust.? I'm scared much more by statistical > modeling of economic data ;) So you use the 'ast' module to convert Python source into a syntax tree, and then you wrote an interpreter for that syntax tree? ...Wouldn't it be easier to use Python's interpreter instead of writing your own, i.e., just call eval() on the source code? Or are you just using the AST to figure out which variables are referenced? (I have some code to do just that without going through the ast module, on the theory that it's nice to be compatible with python 2.5, but I'm not sure it's really worth it.) -- Nathaniel From greg.friedland at gmail.com Mon Feb 27 13:15:28 2012 From: greg.friedland at gmail.com (Greg Friedland) Date: Mon, 27 Feb 2012 10:15:28 -0800 Subject: [SciPy-User] Confidence interval for bounded minimization In-Reply-To: References: Message-ID: Thanks all for the detailed responses and discussion regarding my question. It was very helpful in pointing me in the right direction. Greg On Wed, Feb 22, 2012 at 10:23 PM, wrote: > On Thu, Feb 23, 2012 at 1:10 AM, ? wrote: >> On Thu, Feb 23, 2012 at 12:09 AM, Christopher Jordan-Squire >> wrote: >>> On Wed, Feb 22, 2012 at 2:02 PM, Nathaniel Smith wrote: >>>> On Wed, Feb 22, 2012 at 8:48 PM, ? 
wrote: >>>>> On Wed, Feb 22, 2012 at 3:26 PM, Greg Friedland >>>>> wrote: >>>>>> Hi, >>>>>> >>>>>> Is it possible to calculate asymptotic confidence intervals for any of >>>>>> the bounded minimization algorithms? As far as I can tell they don't >>>>>> return the Hessian; that's including the new 'minimize' function which >>>>>> seemed like it might. >>>>> >>>>> If the parameter ends up at the bounds, then the standard statistics >>>>> doesn't apply. The Hessian is based on a local quadratic >>>>> approximation, which doesn't work if part of the local neigborhood is >>>>> out of bounds. >>>>> There is some special statistics for this, but so far I have seen only >>>>> the description how GAUSS handles it. >>>>> >>>>> In statsmodels we use in some cases the bounds, or a transformation, >>>>> just to keep the optimizer in the required range, and we assume we get >>>>> an interior solution. In this case, it is possible to use the standard >>>>> calculations, the easiest is to use the local minimum that the >>>>> constraint or transformed optimizer found and use it as starting value >>>>> for an unconstrained optimization where we can get the Hessian (or >>>>> just calculate the Hessian based on the original objective function). >>>> >>>> Some optimizers compute the Hessian internally. In those cases, it >>>> would be nice to have a way to ask them to somehow return that value >>>> instead of throwing it away. I haven't used Matlab in a while, but I >>>> remember running into this as a standard feature at some point, and it >>>> was quite nice. Especially when working with a problem where each >>>> computation of the Hessian requires an hour or so of computing time. >>>> >>> >>> Are you talking about analytic or finite-difference gradients and >>> hessians? I'd assumed that anything derived from finite difference >>> estimations wouldn't give particularly good confidence intervals, but >>> I've never needed them so I've never looked into it in detail. >> >> statsmodels has both, all discrete models for example have analytical >> gradients and hessians. >> >> But for models with a complicated log-likelihood function, there isn't >> much choice, second derivatives with centered finite differences are >> ok, scipy.optimize.leastsq is not very good. statsmodels also has >> complex derivatives which are numerically pretty good but they cannot >> always be used. >> >> I think in most cases numerical derivatives will have a precision of a >> few decimals, which is more precise than all the other statistical >> assumptions, normality, law of large numbers, local definition of >> covariance matrix to calculate "large" confidence intervals, and so >> on. >> >> One problem is that choosing the step size depends on the data and >> model. numdifftools has adaptive calculations for the derivatives, but >> we are not using it anymore. >> >> Also, if the model is not well specified, then the lower precision of >> finite difference derivatives can hurt. For example, in ARMA models I >> had problems when there are too many lags specified, so that some >> roots should almost cancel. Skipper's implementation works better >> because he used a reparameterization that forces some nicer behavior. 
>> >> The only case in the econometrics literature that I know is that early >> GARCH models were criticized for using numerical derivatives even >> though analytical derivatives were available, some parameters were not >> well estimated, although different estimates produced essentially the >> same predictions (parameters are barely identified) >> >> Last defense: everyone else does it, maybe a few models more or less, >> and if the same statistical method is used, then the results usually >> agree pretty well. >> (But if different methods are used, for example initial conditions are >> treated differently in time series analysis, then the differences are >> usually much larger. Something like: I don't worry about numerical >> problems at the 5th or 6th decimal if I cannot figure out what these >> guys are doing with their first and second decimal.) >> >> (maybe more than anyone wants to know.) > > In case it wasn't clear: analytical derivatives are of course much > better, and I would be glad if the scipy.stats.distributions or sympy > had the formulas for the derivatives of the log-likelihood functions > for the main distributions. (but it's work) > > Josef > > >> >> Josef >> . >> >>> >>> -Chris >>> >>> >>>> -- Nathaniel >>>> _______________________________________________ >>>> SciPy-User mailing list >>>> SciPy-User at scipy.org >>>> http://mail.scipy.org/mailman/listinfo/scipy-user >>> _______________________________________________ >>> SciPy-User mailing list >>> SciPy-User at scipy.org >>> http://mail.scipy.org/mailman/listinfo/scipy-user > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user From cimrman3 at ntc.zcu.cz Mon Feb 27 13:39:04 2012 From: cimrman3 at ntc.zcu.cz (Robert Cimrman) Date: Mon, 27 Feb 2012 19:39:04 +0100 Subject: [SciPy-User] ANN: SfePy 2012.1 Message-ID: <4F4BCDC8.6070202@ntc.zcu.cz> I am pleased to announce release 2012.1 of SfePy. Description ----------- SfePy (simple finite elements in Python) is a software for solving systems of coupled partial differential equations by the finite element method. The code is based on NumPy and SciPy packages. It is distributed under the new BSD license. Home page: http://sfepy.org Downloads, mailing list, wiki: http://code.google.com/p/sfepy/ Git (source) repository, issue tracker: http://github.com/sfepy Highlights of this release -------------------------- - initial version of linearizer of higher order solutions - rewrite variable and evaluate cache history handling - lots of term updates/fixes/simplifications - move web front page to sphinx docs For full release notes see http://docs.sfepy.org/doc/release_notes.html#id1 (rather long and technical). Best regards, Robert Cimrman and Contributors (*) (*) Contributors to this release (alphabetical order): Tom Aldcroft, Vladim?r Luke?, Maty?? Nov?k, Andre Smit From newville at cars.uchicago.edu Mon Feb 27 14:12:11 2012 From: newville at cars.uchicago.edu (Matt Newville) Date: Mon, 27 Feb 2012 13:12:11 -0600 Subject: [SciPy-User] Alternatives to scipy.optimize In-Reply-To: References: <11398732.11.1330269277150.JavaMail.geo-discussion-forums@ynca15> Message-ID: Hi Nathaniel, On Mon, Feb 27, 2012 at 10:33 AM, Nathaniel Smith wrote: > On Sun, Feb 26, 2012 at 3:14 PM, Matthew Newville > wrote: >> ??? from lmfit import Parameter >> ??? Parameter(name='a', value=10, vary=True) >> ??? 
Parameter(name='b', expr='sqrt(a) + 1') >> >> will compile 'sqrt(a)+1' into its AST representation and evaluate that for >> the value of 'b' when needed.? So lmfit doesn't so much manipulate the AST >> as interpret it.? What is? manipulated is the namespace, so that 'a' is >> interpreted as "look up the current value of Parameter 'a'" when the AST is >> evaluated.?? Again, this applies only for algebraic constraints on >> parameters. >> >> Having written fitting programs that support user-supplied algebraic >> constraints between parameters in Fortran77, I find interpreting python's >> AST to be remarkably simple and robust.? I'm scared much more by statistical >> modeling of economic data ;) > > So you use the 'ast' module to convert Python source into a syntax > tree, and then you wrote an interpreter for that syntax tree? Yes (though just for the expressions for the constraint). I accept that this can be viewed as anywhere on the continuum between highly useful and stark raving mad. It's such a fine line between stupid and, uh, ... clever. > ...Wouldn't it be easier to use Python's interpreter instead of > writing your own, i.e., just call eval() on the source code? Probably, though if evaluating an AST tree scares some people, then I think that eval() would scare the rest even more. So the goal was sort of a safe-ish, mathematically-oriented, eval(). > Or are you just using the AST to figure out which variables are referenced? > (I have some code to do just that without going through the ast > module, on the theory that it's nice to be compatible with python 2.5, > but I'm not sure it's really worth it.) I do figure out which variables are referenced, but do more than that. I run ast.parse(expression) prior to the fit, and then evaluate by walking through the resulting tree at each iteration. When reaching an ast.Name node, I look up the name in pre-reserved dictionary. Many python and numpy symbols (sqrt, pi, etc) are automatically included in this 'namespace'. For each fit, the names of all the Parameters are also included prior to the fit. That gives a restricted language and namespace for writing constraints. For a simple example, from lmfit import Parameter Parameter(name='a', value=10, vary=True) Parameter(name='b', expr='sqrt(a) + 1') Parameter(name='c', expr='a + b') the expression for 'b' is parsed more or less (you can do ast.dump(ast.parse('sqrt(a) + 1')) for a more complete result) to: BinOp(op=Add(), left=Call(func=Name('sqrt'), args=[Name('a')]), right=Num(1) ) that tree is simple to walk, with each node evaluated appropriately. Evaluation of such a tree is so easy that, although it is not highly useful for constraint expressions in a fitting problem, the lmfit.asteval code includes support for while and for loops, if-then-else, and try-except, as well as full slicing and attribute lookups for both evaluation and assignment. The ast module does the parsing and gives a walkable tree -- amazing and beautifl, and standard python (for python 2.6+). As mentioned above, the evaluation of these constraints happens at each iteration of the python function to be minimized by scipy.optimize.leastsq, or others. Dependencies (ie, that 'c' depends on 'a' and 'b' and that 'b' depends on 'a') are recorded so that 'b' above is evaluated once per fitting loop. Admittedly, that's probably a minor optimization, but it is in the lmfit/minimizer.py if anyone is looking. 
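As an illustration of the mechanism described above, and not the actual lmfit.asteval code, a minimal evaluator for one such constraint expression could look like the following; only the node types needed for 'sqrt(a) + 1' are handled, and the namespace dictionary stands in for the parameter lookup:

import ast
import math

symbols = {'sqrt': math.sqrt, 'pi': math.pi}       # plus current parameter values

def eval_node(node, namespace):
    if isinstance(node, ast.Expression):
        return eval_node(node.body, namespace)
    if isinstance(node, ast.Num):                   # numeric literal
        return node.n
    if isinstance(node, ast.Name):                  # parameter or function name
        return namespace[node.id]
    if isinstance(node, ast.BinOp) and isinstance(node.op, ast.Add):
        return eval_node(node.left, namespace) + eval_node(node.right, namespace)
    if isinstance(node, ast.Call):                  # e.g. sqrt(a)
        func = eval_node(node.func, namespace)
        return func(*[eval_node(arg, namespace) for arg in node.args])
    raise ValueError("unsupported syntax: %s" % node.__class__.__name__)

tree = ast.parse('sqrt(a) + 1', mode='eval')        # parse once, before the fit starts
value = eval_node(tree, dict(symbols, a=10.0))      # about 4.162 for a = 10

Each iteration then only repeats the eval_node walk with updated parameter values; the parse itself is done once.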
This approach gives a user a lot of flexibility in setting up fitting models, and can allow the fitting function to remain relatively static and written in terms of the "physical" model. Of course, it is not going to be as fast as numexpr for array calculations, but it is more general, and faster ufuncs is not the issue at hand. Cheers, --Matt Newville From newville at cars.uchicago.edu Mon Feb 27 14:50:05 2012 From: newville at cars.uchicago.edu (Matt Newville) Date: Mon, 27 Feb 2012 13:50:05 -0600 Subject: [SciPy-User] Alternatives to scipy.optimize In-Reply-To: References: <11398732.11.1330269277150.JavaMail.geo-discussion-forums@ynca15> Message-ID: Hi Josef, On Mon, Feb 27, 2012 at 9:56 AM, wrote: > On Sun, Feb 26, 2012 at 10:14 AM, Matthew Newville > wrote: >> >> Hi Erik, Josef, >> >> >> On Saturday, February 25, 2012 8:21:43 PM UTC-6, Erik Petigura wrote: >>> >>> Thanks for getting back to me! >>> I'd like to minimize p1 and p2 together.? Let me try to describe my >>> problem a little better: >>> >>> I'm trying to fit an exoplanet transit light curve.? My model is a box + a >>> polynomial trend. >>> >>> https://gist.github.com/1912265 >>> >>> The polynomial coefficients and the depth of the box are linear >>> parameters, so I want to >>> fit them using linear least squares.? The center and width of the transit >>> are non-linear >>> so I need to fit them with an iterative approach like optimize.fmin. >>> Here's how I implemented it. >>> >>> https://gist.github.com/1912281 >> >> I'm not sure I fully follow your model, but if I understand correctly, >> you're looking to find optimal parameters for something like >> ? model = linear_function(p1) + nonlinear_function(p2) > > yes, I've read about this mostly in the context of semiparametric > versions when the nonlinear function does not have a parametric form. > > http://en.wikipedia.org/wiki/Semiparametric_regression#Partially_linear_models > >> >> for sets of coefficients p1 and p2, each set having a few fitting variables, >> some of which may be related.? Is there an instability that prevents you >> from just treating this as a single non-linear model? > > I think p1 and p2 shouldn't have any cross restrictions or it will get > a bit more complicated. > The main reason for splitting it up is computational, I think. It's > quite common in econometrics to "concentrate out" parameters that have > an explicit solution, so we need the nonlinear optimization only for a > smaller parameter space. > > >> >> Another option might be to have the residual function for >> scipy.optimize.leastsq (or lmfit) call numpy.linalg.lstsq at each >> iteration.? I would think that more fully explore the parameter space than >> first fitting nonlinear_function with scipy.optimize.fmin() then passing >> those best-fit parameters to numpy.linalg.lstsq(), but perhaps I'm not fully >> understanding the nature of the problem. > > That's what both of us did, my version https://gist.github.com/1911544 Right, that does seem to be the preferred solution here, though it wasn't completely clear to me that Erik meant that the parameters in p1 and p2 were always decoupled. I may not have understood the model (and there were a lot of objects named 'p'!). Allowing coupling between some of the elements of p1 and p2 would seem potentially useful to me. >>> There is a lot unpacking and repacking the parameter array as it gets >>> passed around >>> between functions.? One option that might work would be to define >>> functions based on a >>> "parameter object".? 
This parameter object could have attributes like >>> float/fix, >>> linear/non-linear.? I found a more object oriented optimization module >>> here: >>> http://newville.github.com/lmfit-py/ >>> >>> However, it doesn't allow for linear fitting. >> >> Linear fitting could probably be added to lmfit, though I haven't looked >> into it.?? For this problem, I would pursue the idea of treating your >> fitting problem as a single model for non-linear least squares with >> optimize.leastsq or with lmfit.?? Perhaps I missing something about your >> model that makes this approach unusually challenging. >> >> >> >> Josef P wrote: >> >>> The easiest is to just write some helper functions to stack or unstack >>> the parameters, or set some to fixed. In statsmodels we use this in >>> some cases (as methods since our models are classes), also to >>> transform parameters. >>> Since often this affects groups of parameters, I don't know if the >>> lmfit approach would helps in this case. >> >> If many people who are writing their own model functions find themselves >> writing similar helper functions to stack and unstack parameters, "the >> easiest" here might not be "the best", and providing tools to do this >> stacking and unstacking might be worthwhile.?? Lmfit tries to do this. >> >>> (Personally, I like numpy arrays with masks or fancy indexing, which >>> is easy to understand. Ast manipulation scares me.) >> >> I don't understand how masks or fancy indexing would help here. How would >> that work? > > ---- > a few examples how we currently handle parameter restrictions in statsmodels > > Skippers example: transform parameters in the loglikelihood to force > the parameter estimates of an ARMA process to produce a stationary > (stable) solution - transforms groups of parameters at once > https://github.com/statsmodels/statsmodels/blob/master/scikits/statsmodels/tsa/arima_model.py#L230 > https://github.com/statsmodels/statsmodels/blob/master/scikits/statsmodels/tsa/arima_model.py#L436 > > not a clean example: fitting distributions with some frozen parameters > https://github.com/statsmodels/statsmodels/blob/master/scikits/statsmodels/sandbox/distributions/sppatch.py#L267 > select parameters that are not frozen to use with fmin > ? ? x0 ?= np.array(x0)[np.isnan(frmask)] > expand the parameters again to include the frozen parameters inside > the loglikelihood function > ? ? theta = frmask.copy() > ? ? theta[np.isnan(frmask)] = thetash > > It's not as user friendly as the version that got into > scipy.stats.distribution, (but it's developer friendly because I don't > have to stare at it for hours to spot a bug) > > structural Vector Autoregression uses a similar mask pattern > https://github.com/statsmodels/statsmodels/blob/master/scikits/statsmodels/tsa/vector_ar/svar_model.py#L89 > > In another sandbox model I build a nested dictionary to map the model > parameters to the reduced list that can be fed to > scipy.optimize.fmin_xxx > > But we don't have user defined nonlinear constraints yet. Ah, thanks -- I think I understand what you're doing, at least in some of the examples. You're using certain ranges of an array of parameter values to be treated differently, possibly with some able to be fixed or bounded. As you no doubt understand, the approach in lmfit is completely different, and much more flexible. Instead of the user writing a function for leastsq() that takes as the first argument an array of parameter values, they write a function that takes as the first argument an array of Parameters. 
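Schematically, for a toy Gaussian model (made-up parameter names, and not meant as the exact lmfit signature), the difference is something like:

    import numpy as np

    # plain leastsq style: unpack a bare array by position
    def residual_plain(pvals, x, data):
        amp, cen, wid = pvals
        return data - amp * np.exp(-(x - cen)**2 / (2 * wid**2))

    # Parameters style: look values up by name; whether 'wid' is fixed,
    # bounded, or tied to another parameter is decided outside this function
    def residual_named(params, x, data):
        amp = params['amp'].value
        cen = params['cen'].value
        wid = params['wid'].value
        return data - amp * np.exp(-(x - cen)**2 / (2 * wid**2))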
At each iteration, the Parameters will have up-to-date value, after apply the bounds and constraint expression as set for each parameter. The point is that someone can write the function once, in terms of named, physical parameters, but then a user change whether any of the parameters are varied/fixed, have bounds, or are constrained to a mathematical expression in terms of other variables without changing the function that calculates the residual. > --------- >> >> FWIW, lmfit uses python's ast module only for algebraic constraints between >> parameters.? That is, >> ??? from lmfit import Parameter >> ??? Parameter(name='a', value=10, vary=True) >> ??? Parameter(name='b', expr='sqrt(a) + 1') >> >> will compile 'sqrt(a)+1' into its AST representation and evaluate that for >> the value of 'b' when needed.? So lmfit doesn't so much manipulate the AST >> as interpret it.? What is? manipulated is the namespace, so that 'a' is >> interpreted as "look up the current value of Parameter 'a'" when the AST is >> evaluated.?? Again, this applies only for algebraic constraints on >> parameters. > > It's a bit similar to a formula framework for specifying a statistical > model that is to be estimated (with lengthy discussion on the > statsmodels list). Sorry, I don't follow that discussion list (a little outside my field), and wasn't aware of a formula framework in statsmodels. How does that compare? > I see the advantages but I haven't spent the weeks of time to figure > out what's behind the machinery that is required (especially given all > the other statistics and econometrics that is missing, and where I > only have to worry about how numpy, scipy and statsmodels behaves. And > I like Zen of Python #2) Fair enough. The code in lmfit/asteval.py and lmfit/astutils.py is < 1000 Lines, is BSD, and imports only from: ast, math, (numpy), os, re, sys, __future__ That is, the import of numpy is tried, and symbols from numpy will be used if available. Python 2.6+ is required, as the ast module changed quite a bit between 2.5 and 2.6. The 'import from __future__' are for division and print_function, for Python3 compatibility. >> Having written fitting programs that support user-supplied algebraic >> constraints between parameters in Fortran77, I find interpreting python's >> AST to be remarkably simple and robust.? I'm scared much more by statistical >> modeling of economic data ;) > > different tastes and applications. > I'd rather think about the next 20 statistical tests I want to code, > than about the AST or how sympy translates into numpy. OK. That's all completely fair. I'm just saying it's not that hard, and also exists if you're interested. Cheers, --Matt Newville From eptune at gmail.com Mon Feb 27 17:31:50 2012 From: eptune at gmail.com (Erik Petigura) Date: Mon, 27 Feb 2012 14:31:50 -0800 Subject: [SciPy-User] Alternatives to scipy.optimize In-Reply-To: References: <11398732.11.1330269277150.JavaMail.geo-discussion-forums@ynca15> Message-ID: <66DE6EB7-19D2-41BF-B2B3-77262940CF49@gmail.com> Thanks for all the suggestions! The discussion was very enlightening. I wound up writing the following wrappers: https://gist.github.com/1927518 I think Matt has the right idea with lmfit: > This approach gives a user a lot of flexibility in setting up fitting > models, and can allow the fitting function to remain relatively static > and written in terms of the "physical" model. 
Of course, it is not > going to be as fast as numexpr for array calculations, but it is more > general, and faster ufuncs is not the issue at hand. In the end, I decided to fit my linear parameters (polynomial trend) separately from my box parameters, by just excluding the region with the box. Erik -------------- next part -------------- An HTML attachment was scrubbed... URL: From benjamin.hause at colorado.edu Mon Feb 27 23:54:58 2012 From: benjamin.hause at colorado.edu (Benjamin Hause) Date: Tue, 28 Feb 2012 04:54:58 +0000 (UTC) Subject: [SciPy-User] F2py - after making module program runs differently Message-ID: Hello, I have a fortran code that I made into a module using F2py. The module was successfully created as far as I can tell, and I had only made minor changes (making main a subroutine, changing name of a variable, etc.) that should not affect the program. Basically, when I run the module from python and the program from fortran the module gives slightly different output (which, expectedly gets worse with run time as the difference compounds). My question is, is it possible that something was corrupted when making the module that changes how the program runs? If so, how should I go about determining the problem and fixing this? Is there any common issues that would change my output slightly (error starts in about the third decimal place, works its way up over time). Thanks, Ben From ralf.gommers at googlemail.com Tue Feb 28 01:15:25 2012 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Tue, 28 Feb 2012 07:15:25 +0100 Subject: [SciPy-User] ANN: SciPy 0.10.1 released Message-ID: Hi all, I am pleased to announce the availability of SciPy 0.10.1. This is a maintenance release, with no new features compared to 0.10.0. Sources and binaries can be found at http://sourceforge.net/projects/scipy/files/scipy/0.10.1/, release notes are copied below. Enjoy, The SciPy developers ========================== SciPy 0.10.1 Release Notes ========================== .. contents:: SciPy 0.10.1 is a bug-fix release with no new features compared to 0.10.0. Main changes ------------ The most important changes are:: 1. The single precision routines of ``eigs`` and ``eigsh`` in ``scipy.sparse.linalg`` have been disabled (they internally use double precision now). 2. A compatibility issue related to changes in NumPy macros has been fixed, in order to make scipy 0.10.1 compile with the upcoming numpy 1.7.0 release. Other issues fixed ------------------ - #835: stats: nan propagation in stats.distributions - #1202: io: netcdf segfault - #1531: optimize: make curve_fit work with method as callable. - #1560: linalg: fixed mistake in eig_banded documentation. - #1565: ndimage: bug in ndimage.variance - #1457: ndimage: standard_deviation does not work with sequence of indexes - #1562: cluster: segfault in linkage function - #1568: stats: One-sided fisher_exact() returns `p` < 1 for 0 successful attempts - #1575: stats: zscore and zmap handle the axis keyword incorrectly -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at googlemail.com Tue Feb 28 02:28:20 2012 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Tue, 28 Feb 2012 08:28:20 +0100 Subject: [SciPy-User] Does scipy binary install libgfortran.dylib? In-Reply-To: References: Message-ID: On Mon, Feb 27, 2012 at 1:28 PM, Mark Bakker wrote: > Excellent, so it can be done. > The next question is: How? 
> For scipy, first the wrappers are generated with f2py, then the extension is built with gfortran with these flags: /usr/local/bin/gfortran -Wall -undefined dynamic_lookup -bundle -arch i386 -arch ppc -Wl,-search_paths_first -Lbuild build/f2pywrapper.o -lgfortran -o build/XXX.so -Wl,-framework -Wl,Accelerate Hope that helps, Ralf > I found a suggestion from Brian Toby on the web from this summer, but that > didn't work form me. This is what I did: > > On my Mac Terminal I type: > > LDFLAGS='-undefined dynamic_lookup -bundle -static-libgfortran -static-libgcc' > > f2py -c -m besselaes besselaes.f95 > > This nicely creates the extension, which I can run on the machine I created it on, but if I move it to a > machine that doesn't have the libgfortran.3.dylib file, it doesn't run and complains about that, so I > > > conclude that the dynamic link failed (the size of the extension doesn't change when I set the LDFLAGS, > which I thought was a bad omen). Any thoughts? Am I doing something wrong? > > Thanks for any help, > > > Mark > > > > > Date: Sun, 26 Feb 2012 17:38:19 +0100 >> From: Ralf Gommers >> Subject: Re: [SciPy-User] Does scipy binary install libgfortran.dylib? >> To: SciPy Users List >> Message-ID: >> > yGca2tuxXmnkSTQXNTeVwTA1C_igwA at mail.gmail.com> >> Content-Type: text/plain; charset="iso-8859-1" >> >> >> On Sun, Feb 26, 2012 at 11:49 AM, Mark Bakker wrote: >> >> > Hello List, >> > >> > Does Scipy install the correct version of libgfortran.dylib? Does it >> > simply put it in /usr/local/lib/ ? >> > >> > I am trying to distribute my own package which includes FORTRAN >> extensions >> > and when installing on a brand new machine it complains that >> > libgfortran.3.dylib cannot be found. I was wondering how Scipy handles >> this >> > (and thanks to Ralf Gommers for helping me so far, but I haven't been >> able >> > to solve this). >> > >> >> Sorry for not being clearer before - I only knew that this worked for >> scipy, not exactly how it worked. After some digging I found this in the >> 0.7.1 release notes: "Mac OS X binary installer is now a proper universal >> build, and does not depend on gfortran anymore (libgfortran is statically >> linked)." >> >> Ralf >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From rowen at uw.edu Tue Feb 28 15:14:31 2012 From: rowen at uw.edu (Russell E. Owen) Date: Tue, 28 Feb 2012 12:14:31 -0800 Subject: [SciPy-User] Trouble with numpy on OSX... References: Message-ID: In article , Anthony Palomba wrote: > I am trying to get my scipy environment running on my mac. > I have a MBP running OSX 10.7 with python2.7 (python.org) > installed. > > I installed scipy-0.10.0-py2.7-python.org-macosx10.3 > and numpy-1.6.1-py2.7-python.org-macosx10.3. > > When I try to import multiarray, i get the following error... > > ImportError: > dlopen(/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-pa > ckages/numpy/core/multiarray.so, > 2): no suitable image found. Did find: > /Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/ > numpy/core/multiarray.so: > no matching architecture in universal wrapper > > Is there something I am missing? Those packages require 32-bit python.org python (10.3 and later). My guess is that you are running 64-bit python.org python (10.6 and later). 
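You can check which build you are actually running with a couple of lines of standard library code:

    import platform, sys
    print(platform.architecture()[0])   # '32bit' or '64bit'
    print(sys.maxsize > 2**32)          # True on a 64-bit build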
Either install the 32-bit python available from: The link is titled Mac OS X 32-bit i386/PPC Installer (2.7.2) for Mac OS X 10.3 through 10.6 Or else use the 64-bit binary installers for numpy and scipy (labelled macosx10.6). -- Russell From ralf.gommers at googlemail.com Tue Feb 28 16:29:13 2012 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Tue, 28 Feb 2012 22:29:13 +0100 Subject: [SciPy-User] Scipy test failure when building on Scientific Linux 6.0 In-Reply-To: References: Message-ID: On Mon, Feb 27, 2012 at 3:47 PM, Dugan Witherick wrote: > Dear Ralf, > > I've attached the build log and test log to this message. I'm guessing > that mail board will scrub the attachments and replace them with links but > I thought it better to do it this way than fill people's inboxes with long > messages. Please say if you think it is better to just inline the logs. > This worked fine. The build log is certainly too long to put in a mail. All the test failures come from the interpolate module. There are some warnings like warning: "_POSIX_C_SOURCE" redefined which I don't think hurt too much, but indicate something unusual about your setup. There's a bunch more that looks wrong, but I have no idea about the cause: compile options: '-I/usr/lib64/python2.6/site-packages/numpy/core/include -I/usr/include/python2.6 -c' gcc: scipy/interpolate/interpnd.c /usr/lib64/python2.6/site-packages/numpy/core/include/numpy/__multiarray_api.h:1532: warning: ?_import_array? defined but not used /usr/lib64/python2.6/site-packages/numpy/core/include/numpy/__ufunc_api.h:226: warning: ?_import_umath? defined but not used scipy/interpolate/interpnd.c: In function ?__pyx_f_8interpnd__clough_tocher_2d_single_double?: scipy/interpolate/interpnd.c:4383: warning: ?__pyx_v_g1? may be used uninitialized in this function scipy/interpolate/interpnd.c:4384: warning: ?__pyx_v_g2? may be used uninitialized in this function scipy/interpolate/interpnd.c:4385: warning: ?__pyx_v_g3? may be used uninitialized in this function scipy/interpolate/interpnd.c: In function ?__pyx_f_8interpnd__clough_tocher_2d_single_complex?: scipy/interpolate/interpnd.c:4683: warning: ?__pyx_v_g1? may be used uninitialized in this function scipy/interpolate/interpnd.c:4684: warning: ?__pyx_v_g2? may be used uninitialized in this function scipy/interpolate/interpnd.c:4685: warning: ?__pyx_v_g3? may be used uninitialized in this function gcc -pthread -shared build/temp.linux-x86_64-2.6/scipy/interpolate/interpnd.o -L/usr/lib64 -Lbuild/temp.linux-x86_64-2.6 -lpython2.6 -o build/lib.linux-x86_64-2.6/scipy/interpolate/interpnd.so building 'scipy.interpolate._fitpack' extension compiling C sources C compiler: gcc -pthread -fno-strict-aliasing -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m64 -mtune=generic -D_GNU_SOURCE -fPIC -fwrapv -DNDEBUG -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m64 -mtune=generic -D_GNU_SOURCE -fPIC -fwrapv -fPIC creating build/temp.linux-x86_64-2.6/scipy/interpolate/src compile options: '-I/usr/lib64/python2.6/site-packages/numpy/core/include -I/usr/include/python2.6 -c' gcc: scipy/interpolate/src/_fitpackmodule.c In file included from scipy/interpolate/src/_fitpackmodule.c:7: scipy/interpolate/src/__fitpack.h: In function ?fitpack_surfit?: scipy/interpolate/src/__fitpack.h:272: warning: passing argument 18 of ?surfit_? from incompatible pointer type scipy/interpolate/src/__fitpack.h:97: note: expected ?int *? 
but argument is of type ?npy_intp *? scipy/interpolate/src/__fitpack.h:272: warning: passing argument 20 of ?surfit_? from incompatible pointer type scipy/interpolate/src/__fitpack.h:97: note: expected ?int *? but argument is of type ?npy_intp *? scipy/interpolate/src/__fitpack.h:282: warning: passing argument 18 of ?surfit_? from incompatible pointer type scipy/interpolate/src/__fitpack.h:97: note: expected ?int *? but argument is of type ?npy_intp *? scipy/interpolate/src/__fitpack.h:282: warning: passing argument 20 of ?surfit_? from incompatible pointer type scipy/interpolate/src/__fitpack.h:97: note: expected ?int *? but argument is of type ?npy_intp *? scipy/interpolate/src/__fitpack.h: In function ?fitpack_parcur?: Ralf > Dugan > > On 26 February 2012 21:43, Ralf Gommers wrote: > >> >> >> On Fri, Feb 24, 2012 at 12:48 PM, Dugan Witherick wrote: >> >>> I'm trying to build numpy (1.6.1) and scipy (0.10.1rc2) on Scientific >>> Linux 6.0. I've successfully managed to build both packages from source >>> using >>> >>> python setup.py config_fc --fcompiler=gnu95 install >>> >>> but while numpy passes its tests, scipy doesn't: >>> >>> >>> scipy.test() >>> Running unit tests for scipy >>> NumPy version 1.6.1 >>> NumPy is installed in /usr/lib64/python2.6/site-packages/numpy >>> SciPy version 0.10.1rc2 >>> SciPy is installed in /usr/lib64/python2.6/site-packages/scipy >>> Python version 2.6.6 (r266:84292, May 20 2011, 16:42:11) [GCC 4.4.5 >>> 20110214 (Red Hat 4.4.5-6)] >>> nose version 0.10.4 >>> >>> ---SKIPPED---- >>> >>> ====================================================================== >>> ERROR: test_qhull.TestTriangulation.test_pathological >>> ---------------------------------------------------------------------- >>> Traceback (most recent call last): >>> File "/usr/lib/python2.6/site-packages/nose/case.py", line 182, in >>> runTest >>> self.test(*self.arg) >>> File >>> "/usr/lib64/python2.6/site-packages/scipy/spatial/tests/test_qhull.py", >>> line 216, in test_pathological >>> assert_equal(tri.points[tri.vertices].max(), >>> ValueError: zero-size array to maximum.reduce without identity >>> >>> ====================================================================== >>> FAIL: test_interpnd.TestCloughTocher2DInterpolator.test_dense >>> ---------------------------------------------------------------------- >>> Traceback (most recent call last): >>> File "/usr/lib/python2.6/site-packages/nose/case.py", line 182, in >>> runTest >>> self.test(*self.arg) >>> File >>> "/usr/lib64/python2.6/site-packages/scipy/interpolate/tests/test_interpnd.py", >>> line 183, in test_dense >>> err_msg="Function %d" % j) >>> File >>> "/usr/lib64/python2.6/site-packages/scipy/interpolate/tests/test_interpnd.py", >>> line 132, in _check_accuracy >>> assert_allclose(a, b, **kw) >>> File "/usr/lib64/python2.6/site-packages/numpy/testing/utils.py", line >>> 1168, in assert_allclose >>> verbose=verbose, header=header) >>> File "/usr/lib64/python2.6/site-packages/numpy/testing/utils.py", line >>> 605, in assert_array_compare >>> chk_same_position(x_id, y_id, hasval='nan') >>> File "/usr/lib64/python2.6/site-packages/numpy/testing/utils.py", line >>> 588, in chk_same_position >>> raise AssertionError(msg) >>> AssertionError: >>> Not equal to tolerance rtol=0.01, atol=0.005 >>> Function 0 >>> x and y nan location mismatch: >>> x: array([ nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, >>> nan, >>> nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, >>> nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, >>> 
nan,... >>> y: array([ 3.66796999e-02, 1.91605573e-01, 6.08362261e-01, >>> 7.64324844e-02, 9.18031021e-01, 1.28033199e-01, >>> 4.67121584e-01, 1.37085621e-01, 2.53092671e-01,... >>> >>> ---SKIPPED several other fails--- >>> >>> ---------------------------------------------------------------------- >>> Ran 5102 tests in 80.529s >>> >>> FAILED (KNOWNFAIL=13, SKIP=35, errors=1, failures=19) >>> >>> >>> numpy/scipy are being built against lapack (3.2.1), blas (3.2.1) and >>> atlas (3.8.3) from the standard Scientific Linux repository. I would >>> appreciate any advice/suggestions on where I might be going wrong. >>> >> >> Could you post all the test failures and the build log? >> >> Ralf >> >> >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user >> >> > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From doutriaux1 at llnl.gov Tue Feb 28 17:23:12 2012 From: doutriaux1 at llnl.gov (Charles Doutriaux) Date: Tue, 28 Feb 2012 14:23:12 -0800 Subject: [SciPy-User] Fortran on Mac Message-ID: <4F4D53D0.4000806@llnl.gov> Hi all, I'm assuming this is a good place to ask this question. I'm looking for a good/reliable free fortran compiler on mac (10.6 and/or 10.7). I'm currently using gfortran 4.2.3 from http://r.research.att.com/tools/ In the past I used the gfortran from http://hpc.sourceforge.net/ but I've been told on this list (or numpy list I can't remember) to avoid these. The project I'm on requires gfortran 4.3 or greater. Can anybody point me to a good/recent compiler (preferably gfortran) Thanks, C. From johann.cohentanugi at gmail.com Tue Feb 28 17:23:46 2012 From: johann.cohentanugi at gmail.com (Johann Cohen-Tanugi) Date: Tue, 28 Feb 2012 23:23:46 +0100 Subject: [SciPy-User] masking an array ends up flattening it Message-ID: <4F4D53F2.8060807@gmail.com> Hello, I have the following: In [145]: m Out[145]: array([[ 1.82243247e-23, -5.53103453e-14, 4.32071039e-13, 0.00000000e+00], [ -5.52425949e-14, 6.26697129e-02, -5.12076585e-02, 0.00000000e+00], [ 4.31598429e-13, -5.12102340e-02, 6.27539118e-02, 0.00000000e+00], [ 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 1.00000000e+10]]) In [146]: mask Out[146]: array([[ True, True, True, False], [ True, True, True, False], [ True, True, True, False], [False, False, False, False]], dtype=bool) Naively, I thought I would end up with a (3,3) shaped array when applying the mask to m, but instead I get : In [147]: m[mask] Out[147]: array([ 1.82243247e-23, -5.53103453e-14, 4.32071039e-13, -5.52425949e-14, 6.26697129e-02, -5.12076585e-02, 4.31598429e-13, -5.12102340e-02, 6.27539118e-02]) In [148]: m[mask].shape Out[148]: (9,) Is there another way to proceed and get directly the (3,3) shaped masked array, or do I need to reshape it by hand? thanks a lot in advance, Johann From guziy.sasha at gmail.com Tue Feb 28 17:32:55 2012 From: guziy.sasha at gmail.com (Oleksandr Huziy) Date: Tue, 28 Feb 2012 17:32:55 -0500 Subject: [SciPy-User] masking an array ends up flattening it In-Reply-To: <4F4D53F2.8060807@gmail.com> References: <4F4D53F2.8060807@gmail.com> Message-ID: Hi, I don't think it is possible, and also not sure if the order will be conserved when you select and reshape. Why do you need it? 
Think what would you like to get in the case when you have only one element True? Cheers -- Oleksandr Huziy 2012/2/28 Johann Cohen-Tanugi : > Hello, > I have the following: > In [145]: m > Out[145]: > array([[ ?1.82243247e-23, ?-5.53103453e-14, ? 4.32071039e-13, > ? ? ? ? ? 0.00000000e+00], > ? ? ? ?[ -5.52425949e-14, ? 6.26697129e-02, ?-5.12076585e-02, > ? ? ? ? ? 0.00000000e+00], > ? ? ? ?[ ?4.31598429e-13, ?-5.12102340e-02, ? 6.27539118e-02, > ? ? ? ? ? 0.00000000e+00], > ? ? ? ?[ ?0.00000000e+00, ? 0.00000000e+00, ? 0.00000000e+00, > ? ? ? ? ? 1.00000000e+10]]) > > In [146]: mask > Out[146]: > array([[ True, ?True, ?True, False], > ? ? ? ?[ True, ?True, ?True, False], > ? ? ? ?[ True, ?True, ?True, False], > ? ? ? ?[False, False, False, False]], dtype=bool) > > Naively, I thought I would end up with a (3,3) shaped array when > applying the mask to m, but instead I get : > > In [147]: m[mask] > Out[147]: > array([ ?1.82243247e-23, ?-5.53103453e-14, ? 4.32071039e-13, > ? ? ? ? -5.52425949e-14, ? 6.26697129e-02, ?-5.12076585e-02, > ? ? ? ? ?4.31598429e-13, ?-5.12102340e-02, ? 6.27539118e-02]) > > In [148]: m[mask].shape > Out[148]: (9,) > > Is there another way to proceed and get directly the (3,3) shaped masked > array, or do I need to reshape it by hand? > > thanks a lot in advance, > Johann > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user From doutriaux1 at llnl.gov Tue Feb 28 17:32:51 2012 From: doutriaux1 at llnl.gov (Charles Doutriaux) Date: Tue, 28 Feb 2012 14:32:51 -0800 Subject: [SciPy-User] Fortran on Mac In-Reply-To: <4F4D53FC.10201@txcorp.com> References: <4F4D53D0.4000806@llnl.gov> <4F4D53FC.10201@txcorp.com> Message-ID: <4F4D5613.2030007@llnl.gov> Thanks Alex, Looks like it's just what I need! C. On 2/28/12 2:23 PM, Alexander Pletzer wrote: > Charles, > > I'm not a mac user but I would try > > http://gcc.gnu.org/wiki/GFortranBinaries#MacOS > > --Alex > > On 02/28/2012 03:23 PM, Charles Doutriaux wrote: >> Hi all, >> >> I'm assuming this is a good place to ask this question. >> >> I'm looking for a good/reliable free fortran compiler on mac (10.6 >> and/or 10.7). >> >> I'm currently using gfortran 4.2.3 from http://r.research.att.com/tools/ >> >> In the past I used the gfortran from http://hpc.sourceforge.net/ but >> I've been told on this list (or numpy list I can't remember) to avoid >> these. >> >> The project I'm on requires gfortran 4.3 or greater. >> >> Can anybody point me to a good/recent compiler (preferably gfortran) >> >> Thanks, >> >> C. From zachary.pincus at yale.edu Tue Feb 28 17:35:11 2012 From: zachary.pincus at yale.edu (Zachary Pincus) Date: Tue, 28 Feb 2012 17:35:11 -0500 Subject: [SciPy-User] masking an array ends up flattening it In-Reply-To: <4F4D53F2.8060807@gmail.com> References: <4F4D53F2.8060807@gmail.com> Message-ID: Hi Johann, > In [146]: mask > Out[146]: > array([[ True, True, True, False], > [ True, True, True, False], > [ True, True, True, False], > [False, False, False, False]], dtype=bool) > > Naively, I thought I would end up with a (3,3) shaped array when > applying the mask to m So that would make some sense for the above mask, but obviously doesn't generalize... what shape output would you expect if 'mask' looked like the following? array([[ True, True, True, False], [ True, True, True, False], [ True, True, True, False], [False, False, False, True]], dtype=bool) Flattening turns out to be the most-sensible general-case thing to do. 
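A quick toy example of why (a single extra True element rules out any square result):

    import numpy as np
    m = np.arange(16.).reshape(4, 4)
    mask = np.zeros((4, 4), dtype=bool)
    mask[:3, :3] = True
    mask[3, 3] = True         # now 10 selected elements, no (3, 3) possible
    print(m[mask].shape)      # (10,)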
Fortunately, this is generally not a problem, because often one winds up doing things like: a[mask] = b[mask] where a and b can both be n-dimensional, and the fact that you go through a flattened intermediate is no problem. If, on the other hand, your task requires slicing square regions out of arrays, you could do that directly by other sorts of fancy-indexing or using programatically-generated slice objects, or some such. Can you describe the overall task? Perhaps then someone could suggest the "idiomatic numpy" solution? Zach > , but instead I get : > > In [147]: m[mask] > Out[147]: > array([ 1.82243247e-23, -5.53103453e-14, 4.32071039e-13, > -5.52425949e-14, 6.26697129e-02, -5.12076585e-02, > 4.31598429e-13, -5.12102340e-02, 6.27539118e-02]) > > In [148]: m[mask].shape > Out[148]: (9,) > > Is there another way to proceed and get directly the (3,3) shaped masked > array, or do I need to reshape it by hand? > > thanks a lot in advance, > Johann > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user From caraciol at gmail.com Tue Feb 28 21:47:26 2012 From: caraciol at gmail.com (Marcel Caraciolo) Date: Tue, 28 Feb 2012 23:47:26 -0300 Subject: [SciPy-User] Facing a problem with integrations Message-ID: Hi all, My name is Marcel and I am lecturing scientific computing with Python here at Brazil. One of my students came to me with a problem that he is currently solving it with matlab but he decided to change his code to Python (thanks to the course!) The problem is calculate numerically the coefficients aim that are defined by the following integral [1]. It must be calculated using integrals. In the example showed above he wants to use the trapezoid rule adapted for 2-D arrays or if there is any another solutions easily with scipy it would be match perfectly also. Here is the matrix input (U) and the corresponding coefficients (solution). The goal is to calculate the corresponding coefficients by the formula (integral) shown at [1]. Could anyone give some a solution using scipy.integrate ? I tried several proposals but it didn't worked. [1] http://dl.dropbox.com/u/1977573/pic1.png [2] http://dl.dropbox.com/u/1977573/pic2.png Regards, -- Marcel Pinheiro Caraciolo M.S.C. Candidate at CIN/UFPE http://www.mobideia.com http://aimotion.blogspot.com/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From kasoft1010 at gmail.com Tue Feb 28 23:31:43 2012 From: kasoft1010 at gmail.com (Abdu Adil) Date: Wed, 29 Feb 2012 12:31:43 +0800 Subject: [SciPy-User] Convolution of sinus signal and rectangular pulse Message-ID: I would like to perform the operation of convolution of sinus signal and rectangular pulse, in scipy, I convolved sinus signal with cosinus signal and ploted that on the graph, but I would like to know how to create array with rectangular pulse, something similar to this matlab expression y = rectpulse(x,nsamp), so I can convolve them, i use this to create my sinus and cosinus signal x=r_[0:50] (my array) y01=sin(2*pi*x/49) y02=cos(2*pi*x/49) So i tried to create a nu.zeros(50), and manually changing the zeros from position 15-25 from 0.0. 
to 0.9 so it looks like rectangle but convolution on sinus array and this 'rectangle' array is weird, It is supposed to be zero when there is no intersection but i get sinus signal in return, here is the code http://tinypaste.com/4791061a I apologize in advance, i feel like this is the easiest thing but I could not find any reference on how to create a rectangular pulse, Im doing computer science and I just started this course signals and systems and Im a bit lame i know :D. -------------- next part -------------- An HTML attachment was scrubbed... URL: From johann.cohentanugi at gmail.com Wed Feb 29 02:38:45 2012 From: johann.cohentanugi at gmail.com (Johann Cohen-Tanugi) Date: Wed, 29 Feb 2012 08:38:45 +0100 Subject: [SciPy-User] masking an array ends up flattening it In-Reply-To: References: <4F4D53F2.8060807@gmail.com> Message-ID: <4F4DD605.6060008@gmail.com> Hi Zach, thanks a lot. I should know by now that naive expectations that are not met in numpy are generally so for lack of generalization! Your example makes perfect sense. My use case is a covariance matrix that has the dimension of all the parameters available, but some of them are fix in a fit, and I have a bool array that tells me which parameters are fixed. I then would like to "extract" the covariance matrix of the free parameters. I would rather go for masking and then reshaping than fancy indexing, which if too fancy start scaring me Of course if there is a clean solution, I am all ears. thanks again, johann On 02/28/2012 11:35 PM, Zachary Pincus wrote: > Hi Johann, > >> In [146]: mask >> Out[146]: >> array([[ True, True, True, False], >> [ True, True, True, False], >> [ True, True, True, False], >> [False, False, False, False]], dtype=bool) >> >> Naively, I thought I would end up with a (3,3) shaped array when >> applying the mask to m > > So that would make some sense for the above mask, but obviously doesn't generalize... what shape output would you expect if 'mask' looked like the following? > > array([[ True, True, True, False], > [ True, True, True, False], > [ True, True, True, False], > [False, False, False, True]], dtype=bool) > > Flattening turns out to be the most-sensible general-case thing to do. Fortunately, this is generally not a problem, because often one winds up doing things like: > a[mask] = b[mask] > where a and b can both be n-dimensional, and the fact that you go through a flattened intermediate is no problem. > > If, on the other hand, your task requires slicing square regions out of arrays, you could do that directly by other sorts of fancy-indexing or using programatically-generated slice objects, or some such. Can you describe the overall task? Perhaps then someone could suggest the "idiomatic numpy" solution? > > Zach > > > >> , but instead I get : >> >> In [147]: m[mask] >> Out[147]: >> array([ 1.82243247e-23, -5.53103453e-14, 4.32071039e-13, >> -5.52425949e-14, 6.26697129e-02, -5.12076585e-02, >> 4.31598429e-13, -5.12102340e-02, 6.27539118e-02]) >> >> In [148]: m[mask].shape >> Out[148]: (9,) >> >> Is there another way to proceed and get directly the (3,3) shaped masked >> array, or do I need to reshape it by hand? 
>> >> thanks a lot in advance, >> Johann >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From paul.anton.letnes at gmail.com Wed Feb 29 03:26:08 2012 From: paul.anton.letnes at gmail.com (Paul Anton Letnes) Date: Wed, 29 Feb 2012 08:26:08 +0000 Subject: [SciPy-User] Fortran on Mac In-Reply-To: <4F4D5613.2030007@llnl.gov> References: <4F4D53D0.4000806@llnl.gov> <4F4D53FC.10201@txcorp.com> <4F4D5613.2030007@llnl.gov> Message-ID: Hi, I built gfortran 4.6.2 from source with the formula in Homebrew-alt. It doesn't work 100% with scipy, but it compiles my own (pure) fortran project decently well. Fortunately I've got different machines for production runs... I don't quite trust this gfortran. The official gcc binaries are probably a better choice, though. Paul On 28. feb. 2012, at 22:32, Charles Doutriaux wrote: > Thanks Alex, > > Looks like it's just what I need! > > C. > > > On 2/28/12 2:23 PM, Alexander Pletzer wrote: >> Charles, >> >> I'm not a mac user but I would try >> >> http://gcc.gnu.org/wiki/GFortranBinaries#MacOS >> >> --Alex >> >> On 02/28/2012 03:23 PM, Charles Doutriaux wrote: >>> Hi all, >>> >>> I'm assuming this is a good place to ask this question. >>> >>> I'm looking for a good/reliable free fortran compiler on mac (10.6 >>> and/or 10.7). >>> >>> I'm currently using gfortran 4.2.3 from http://r.research.att.com/tools/ >>> >>> In the past I used the gfortran from http://hpc.sourceforge.net/ but >>> I've been told on this list (or numpy list I can't remember) to avoid >>> these. >>> >>> The project I'm on requires gfortran 4.3 or greater. >>> >>> Can anybody point me to a good/recent compiler (preferably gfortran) >>> >>> Thanks, >>> >>> C. > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user From bronger at physik.rwth-aachen.de Wed Feb 29 03:32:45 2012 From: bronger at physik.rwth-aachen.de (Torsten Bronger) Date: Wed, 29 Feb 2012 09:32:45 +0100 Subject: [SciPy-User] umfpack runs out of memory Message-ID: <87r4xejaz6.fsf@physik.rwth-aachen.de> Hall?chen! I currently deal with large matrices. I use the SciPy package of Ubuntu 10.04. The umfpack package is called libumfpack5.4.0. The Python process uses approx. 6 GB of Memory and aborts. The traceback is Traceback (most recent call last): ... File "/usr/lib/python2.6/dist-packages/scipy/sparse/linalg/dsolve/linsolve.py", line 88, in spsolve autoTranspose = True ) File "/usr/lib/python2.6/dist-packages/scipy/sparse/linalg/dsolve/umfpack/umfpack.py", line 582, in linsolve self.numeric( mtx ) File "/usr/lib/python2.6/dist-packages/scipy/sparse/linalg/dsolve/umfpack/umfpack.py", line 431, in numeric umfStatus[status]) RuntimeError: failed with UMFPACK_ERROR_out_of_memory I suspect a 4 GB limit somewhere. Can I do something about it? Tsch?, Torsten. 
-- Torsten Bronger Jabber ID: torsten.bronger at jabber.rwth-aachen.de or http://bronger-jmp.appspot.com From xabart at gmail.com Wed Feb 29 04:49:15 2012 From: xabart at gmail.com (Xavier Barthelemy) Date: Wed, 29 Feb 2012 20:49:15 +1100 Subject: [SciPy-User] umfpack runs out of memory In-Reply-To: <87r4xejaz6.fsf@physik.rwth-aachen.de> References: <87r4xejaz6.fsf@physik.rwth-aachen.de> Message-ID: sometimes an unix or linux is configured with a max limit of resources by user. the standard case is 4GB ram per process try "ulimit" and check what is the answer (if installed). if this is the case a "ulimit unlimited" should answer the problem Xavier 2012/2/29 Torsten Bronger : > Hall?chen! > > I currently deal with large matrices. ?I use the SciPy package of > Ubuntu 10.04. ?The umfpack package is called libumfpack5.4.0. > > The Python process uses approx. 6 GB of Memory and aborts. ?The > traceback is > > Traceback (most recent call last): > ?... > ?File "/usr/lib/python2.6/dist-packages/scipy/sparse/linalg/dsolve/linsolve.py", line 88, in spsolve > ? ?autoTranspose = True ) > ?File "/usr/lib/python2.6/dist-packages/scipy/sparse/linalg/dsolve/umfpack/umfpack.py", line 582, in linsolve > ? ?self.numeric( mtx ) > ?File "/usr/lib/python2.6/dist-packages/scipy/sparse/linalg/dsolve/umfpack/umfpack.py", line 431, in numeric > ? ?umfStatus[status]) > RuntimeError: failed with UMFPACK_ERROR_out_of_memory > > I suspect a 4 GB limit somewhere. ?Can I do something about it? > > Tsch?, > Torsten. > > -- > Torsten Bronger ? ?Jabber ID: torsten.bronger at jabber.rwth-aachen.de > ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?or http://bronger-jmp.appspot.com > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user -- ?? Quand le gouvernement viole les droits du peuple, l'insurrection est, pour le peuple et pour chaque portion du peuple, le plus sacr? des droits et le plus indispensable des devoirs ? D?claration des droits de l'homme et du citoyen, article 35, 1793 From bronger at physik.rwth-aachen.de Wed Feb 29 05:06:36 2012 From: bronger at physik.rwth-aachen.de (Torsten Bronger) Date: Wed, 29 Feb 2012 11:06:36 +0100 Subject: [SciPy-User] umfpack runs out of memory In-Reply-To: References: <87r4xejaz6.fsf@physik.rwth-aachen.de> Message-ID: <87mx82j6mr.fsf@physik.rwth-aachen.de> Hall?chen! Xavier Barthelemy writes: > sometimes an unix or linux is configured with a max limit of > resources by user. the standard case is 4GB ram per process try > "ulimit" and check what is the answer (if installed). if this is > the case a "ulimit unlimited" should answer the problem ulimit already tells me "unlimited". Tsch?, Torsten. -- Torsten Bronger Jabber ID: torsten.bronger at jabber.rwth-aachen.de or http://bronger-jmp.appspot.com From josef.pktd at gmail.com Wed Feb 29 09:43:34 2012 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Wed, 29 Feb 2012 09:43:34 -0500 Subject: [SciPy-User] a bit OT: running 2to3 on examples and docs automatically Message-ID: I'm looking for a recipe how to run the 2to3 conversion automatically on extra folders, like scripts in docs folder and the examples folder. How are other packages handling this? Where should I look? I just followed scikits.image for adding 2to3 to setup.py for the source, but I didn't see anything about converting the examples. 
Thanks, Josef From pletzer at txcorp.com Tue Feb 28 17:23:56 2012 From: pletzer at txcorp.com (Alexander Pletzer) Date: Tue, 28 Feb 2012 15:23:56 -0700 Subject: [SciPy-User] Fortran on Mac In-Reply-To: <4F4D53D0.4000806@llnl.gov> References: <4F4D53D0.4000806@llnl.gov> Message-ID: <4F4D53FC.10201@txcorp.com> Charles, I'm not a mac user but I would try http://gcc.gnu.org/wiki/GFortranBinaries#MacOS --Alex On 02/28/2012 03:23 PM, Charles Doutriaux wrote: > Hi all, > > I'm assuming this is a good place to ask this question. > > I'm looking for a good/reliable free fortran compiler on mac (10.6 > and/or 10.7). > > I'm currently using gfortran 4.2.3 from http://r.research.att.com/tools/ > > In the past I used the gfortran from http://hpc.sourceforge.net/ but > I've been told on this list (or numpy list I can't remember) to avoid > these. > > The project I'm on requires gfortran 4.3 or greater. > > Can anybody point me to a good/recent compiler (preferably gfortran) > > Thanks, > > C. From nur-idura.amran at wct.com.my Tue Feb 28 23:16:44 2012 From: nur-idura.amran at wct.com.my (idura) Date: Wed, 29 Feb 2012 12:16:44 +0800 Subject: [SciPy-User] Speeding up Python Again Message-ID: <000001ccf698$f2797610$d76c6230$@wct.com.my> -------------- next part -------------- An HTML attachment was scrubbed... URL: From johann.cohen-tanugi at univ-montp2.fr Wed Feb 29 02:24:04 2012 From: johann.cohen-tanugi at univ-montp2.fr (Johann Cohen-Tanugi) Date: Wed, 29 Feb 2012 08:24:04 +0100 Subject: [SciPy-User] masking an array ends up flattening it In-Reply-To: References: <4F4D53F2.8060807@gmail.com> Message-ID: <4F4DD294.3040107@univ-montp2.fr> Hi Zach, thanks a lot. I should know by now that naive expectations that are not met in numpy are generally so for lack of generalization! Your example makes perfect sense. My use case is a covariance matrix that has the dimension of all the parameters available, but some of them are fix in a fit, and I have a bool array that tells me which parameters are fixed. I then would like to "extract" the covariance matrix of the free parameters. I would rather go for masking and then reshaping than fancy indexing, which if too fancy start scaring me :) Of course if there is a clean solution, I am all ears. thanks again, johann On 02/28/2012 11:35 PM, Zachary Pincus wrote: > Hi Johann, > >> In [146]: mask >> Out[146]: >> array([[ True, True, True, False], >> [ True, True, True, False], >> [ True, True, True, False], >> [False, False, False, False]], dtype=bool) >> >> Naively, I thought I would end up with a (3,3) shaped array when >> applying the mask to m > > So that would make some sense for the above mask, but obviously doesn't generalize... what shape output would you expect if 'mask' looked like the following? > > array([[ True, True, True, False], > [ True, True, True, False], > [ True, True, True, False], > [False, False, False, True]], dtype=bool) > > Flattening turns out to be the most-sensible general-case thing to do. Fortunately, this is generally not a problem, because often one winds up doing things like: > a[mask] = b[mask] > where a and b can both be n-dimensional, and the fact that you go through a flattened intermediate is no problem. > > If, on the other hand, your task requires slicing square regions out of arrays, you could do that directly by other sorts of fancy-indexing or using programatically-generated slice objects, or some such. Can you describe the overall task? Perhaps then someone could suggest the "idiomatic numpy" solution? 
> > Zach > > > >> , but instead I get : >> >> In [147]: m[mask] >> Out[147]: >> array([ 1.82243247e-23, -5.53103453e-14, 4.32071039e-13, >> -5.52425949e-14, 6.26697129e-02, -5.12076585e-02, >> 4.31598429e-13, -5.12102340e-02, 6.27539118e-02]) >> >> In [148]: m[mask].shape >> Out[148]: (9,) >> >> Is there another way to proceed and get directly the (3,3) shaped masked >> array, or do I need to reshape it by hand? >> >> thanks a lot in advance, >> Johann >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From robert.kern at gmail.com Wed Feb 29 11:09:47 2012 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 29 Feb 2012 16:09:47 +0000 Subject: [SciPy-User] Fortran on Mac In-Reply-To: <4F4D53FC.10201@txcorp.com> References: <4F4D53D0.4000806@llnl.gov> <4F4D53FC.10201@txcorp.com> Message-ID: On Tue, Feb 28, 2012 at 22:23, Alexander Pletzer wrote: > Charles, > > I'm not a mac user but I would try > > http://gcc.gnu.org/wiki/GFortranBinaries#MacOS The builds hosted on this page do not support the Apple-specific flags necessary for linking extension modules for most Mac Python distributions (specifically framework builds). -- Robert Kern From zachary.pincus at yale.edu Wed Feb 29 11:50:26 2012 From: zachary.pincus at yale.edu (Zachary Pincus) Date: Wed, 29 Feb 2012 11:50:26 -0500 Subject: [SciPy-User] masking an array ends up flattening it In-Reply-To: <4F4DD294.3040107@univ-montp2.fr> References: <4F4D53F2.8060807@gmail.com> <4F4DD294.3040107@univ-montp2.fr> Message-ID: <967723B6-D7D4-45BF-8A8C-3422D0F71782@yale.edu> > Hi Zach, thanks a lot. I should know by now that naive expectations that are not met in numpy are generally so for lack of generalization! Your example makes perfect sense. > My use case is a covariance matrix that has the dimension of all the parameters available, but some of them are fix in a fit, and I have a bool array that tells me which parameters are fixed. I then would like to "extract" the covariance matrix of the free parameters. > > I would rather go for masking and then reshaping than fancy indexing, which if too fancy start scaring me :) > Of course if there is a clean solution, I am all ears. OK, so you have a list of parameter indices that are "good" and you want to get the sub-matrix out corresponding to just the rows and columns at those indices? E.g.: a = numpy.arange(25).reshape((5,5)) print a array([[ 0, 1, 2, 3, 4], [ 5, 6, 7, 8, 9], [10, 11, 12, 13, 14], [15, 16, 17, 18, 19], [20, 21, 22, 23, 24]]) Then, say you want to get the sub-matrix of 'a' corresponding to rows/columns 1 and 3? Is this equivalent to what you need to do? That is, you want the following: array([[ 6, 8], [16, 18]]) For this you might think to do the following: a[[1,3], [1,3]] but this returns 'array([ 6, 18])' -- you have pulled out a flat list of two elements, at indices [1,1] and [3,3]... This sort of fancy indexing is VERY useful in many cases, but not the case you want, which is more like a "cross product" sort of indexing problem. It turns out that what you really want is: a[ [[1,1],[3,3]], [[1,3],[1,3]] ] which yields: array([[ 6, 8], [16, 18]]) This makes sense -- you pass in a two 2D arrays, one containing the x-coords and one the y-coords, and you get out a 2D array of the same shape. 
Perhaps-insanely, the above can be simplified to: a[ [[1],[3]], [[1,3]] ] If you understand numpy broadcasting rules, you may see how: [[1],[3]], [[1,3]] broadcasts to be the same as: [[1,1],[3,3]], [[1,3],[1,3]] Fortunately, all of this mind-bending stuff is can be done behind the scenes with a cross-product indexing helper function: a[ numpy.ix_([1,3], [1,3]) ] takes care of it for you, and gives the desired array([[ 6, 8], [16, 18]]) This is all pretty advanced-sounding stuff... but most of it's laid out in sections 5 and 6 of the tentative tutorial: http://www.scipy.org/Tentative_NumPy_Tutorial You might also want to peruse St?fan's advanced numpy tutorial -- the broadcasting and indexing sections are really useful. http://mentat.za.net/numpy/numpy_advanced_slides/ Zach > thanks again, > johann > > On 02/28/2012 11:35 PM, Zachary Pincus wrote: >> Hi Johann, >> >>> In [146]: mask >>> Out[146]: >>> array([[ True, True, True, False], >>> [ True, True, True, False], >>> [ True, True, True, False], >>> [False, False, False, False]], dtype=bool) >>> >>> Naively, I thought I would end up with a (3,3) shaped array when >>> applying the mask to m >> >> So that would make some sense for the above mask, but obviously doesn't generalize... what shape output would you expect if 'mask' looked like the following? >> >> array([[ True, True, True, False], >> [ True, True, True, False], >> [ True, True, True, False], >> [False, False, False, True]], dtype=bool) >> >> Flattening turns out to be the most-sensible general-case thing to do. Fortunately, this is generally not a problem, because often one winds up doing things like: >> a[mask] = b[mask] >> where a and b can both be n-dimensional, and the fact that you go through a flattened intermediate is no problem. >> >> If, on the other hand, your task requires slicing square regions out of arrays, you could do that directly by other sorts of fancy-indexing or using programatically-generated slice objects, or some such. Can you describe the overall task? Perhaps then someone could suggest the "idiomatic numpy" solution? >> >> Zach >> >> >> >>> , but instead I get : >>> >>> In [147]: m[mask] >>> Out[147]: >>> array([ 1.82243247e-23, -5.53103453e-14, 4.32071039e-13, >>> -5.52425949e-14, 6.26697129e-02, -5.12076585e-02, >>> 4.31598429e-13, -5.12102340e-02, 6.27539118e-02]) >>> >>> In [148]: m[mask].shape >>> Out[148]: (9,) >>> >>> Is there another way to proceed and get directly the (3,3) shaped masked >>> array, or do I need to reshape it by hand? >>> >>> thanks a lot in advance, >>> Johann >>> _______________________________________________ >>> SciPy-User mailing list >>> SciPy-User at scipy.org >>> http://mail.scipy.org/mailman/listinfo/scipy-user >> >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user >> From ralf.gommers at googlemail.com Wed Feb 29 15:28:33 2012 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Wed, 29 Feb 2012 21:28:33 +0100 Subject: [SciPy-User] a bit OT: running 2to3 on examples and docs automatically In-Reply-To: References: Message-ID: On Wed, Feb 29, 2012 at 3:43 PM, wrote: > I'm looking for a recipe how to run the 2to3 conversion automatically > on extra folders, like scripts in docs folder and the examples folder. > > How are other packages handling this? Where should I look? > > I just followed scikits.image for adding 2to3 to setup.py for the > source, but I didn't see anything about converting the examples. 
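Untested sketch of what I mean (the directory names are just placeholders):

    import os, subprocess

    def convert_extras(topdirs=('doc', 'examples')):
        # run 2to3 in place; -d handles doctests embedded in text files
        for topdir in topdirs:
            for root, dirs, files in os.walk(topdir):
                for name in files:
                    path = os.path.join(root, name)
                    if name.endswith('.py'):
                        subprocess.check_call(['2to3', '-w', path])
                    elif name.endswith(('.rst', '.txt')):
                        subprocess.check_call(['2to3', '-d', '-w', path])

    if __name__ == '__main__':
        convert_extras()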
> I don't have an example to point you to, but there's probably not more to it then writing a small script that finds all .rst and .txt doc files and runs 2to3 on each file with the -d flag. I'd expect examples ending in .py to be found already, but if not just find them in your examples folder in that same script. A comment in the statsmodels setup.py on how to convert with 2to3 might be useful, I was expecting to find a call there to a numpy style conversion script. I found it in README.txt in the end, but who reads READMEs these days? Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Wed Feb 29 15:36:19 2012 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Wed, 29 Feb 2012 15:36:19 -0500 Subject: [SciPy-User] a bit OT: running 2to3 on examples and docs automatically In-Reply-To: References: Message-ID: On Wed, Feb 29, 2012 at 3:28 PM, Ralf Gommers wrote: > > > On Wed, Feb 29, 2012 at 3:43 PM, wrote: >> >> I'm looking for a recipe how to run the 2to3 conversion automatically >> on extra folders, like scripts in docs folder and the examples folder. >> >> How are other packages handling this? Where should I look? >> >> I just followed scikits.image for adding 2to3 to setup.py for the >> source, but I didn't see anything about converting the examples. > > > I don't have an example to point you to, but there's probably not more to it > then writing a small script that finds all .rst and .txt doc files and runs > 2to3 on each file with the -d flag. I'd expect examples ending in .py to be > found already, but if not just find them in your examples folder in that > same script. Something like this I thought of doing, but seeing how someone else is doing it would save some time. > > A comment in the statsmodels setup.py on how to convert with 2to3 might be > useful, I was expecting to find a call there to a numpy style conversion > script. I found it in README.txt in the end, but who reads READMEs these > days? oops, I didn't read the README.txt either I just changed the setup.py to run 2to3 automatically, but didn't think of updating any documentation yet. Thanks, Josef > > Ralf > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From wesmckinn at gmail.com Wed Feb 29 19:31:28 2012 From: wesmckinn at gmail.com (Wes McKinney) Date: Wed, 29 Feb 2012 19:31:28 -0500 Subject: [SciPy-User] ANN: pandas 0.7.1 released Message-ID: hi all, I'm happy to announce the pandas 0.7.1 release. This is primarily a bugfix release from 0.7.0, but includes a couple notable performance enhancements and a handful of new functions and features. Source archives and Windows installers are now available on PyPI. Major work is underway for pandas 0.8.0, likely to be released at the end of April. For example, the time series capabilities are seeing significant work, incorporating the features which have been available in scikits.timeseries but not in pandas. See the issue tracker for a full of list planned new features and performance/infrastructural improvements. If you are interested in becoming more involved with the project, the issue tracker (which is really the TODO list!) is the best place to start. http://github.com/pydata/pandas/issues?milestone=2&state=open Thanks to all who contributed to this release! 
From wesmckinn at gmail.com  Wed Feb 29 19:31:28 2012
From: wesmckinn at gmail.com (Wes McKinney)
Date: Wed, 29 Feb 2012 19:31:28 -0500
Subject: [SciPy-User] ANN: pandas 0.7.1 released
Message-ID:

hi all,

I'm happy to announce the pandas 0.7.1 release. This is primarily a
bugfix release from 0.7.0, but includes a couple notable performance
enhancements and a handful of new functions and features. Source
archives and Windows installers are now available on PyPI.

Major work is underway for pandas 0.8.0, likely to be released at the
end of April. For example, the time series capabilities are seeing
significant work, incorporating the features which have been available
in scikits.timeseries but not in pandas. See the issue tracker for a
full list of planned new features and performance/infrastructural
improvements.

If you are interested in becoming more involved with the project, the
issue tracker (which is really the TODO list!) is the best place to start.

http://github.com/pydata/pandas/issues?milestone=2&state=open

Thanks to all who contributed to this release!

- Wes

What is it
==========
pandas is a Python package providing fast, flexible, and expressive data
structures designed to make working with "relational" or "labeled" data
both easy and intuitive. It aims to be the fundamental high-level building
block for doing practical, real-world data analysis in Python.

Links
=====
Release Notes: http://github.com/pydata/pandas/blob/master/RELEASE.rst
Documentation: http://pandas.pydata.org
Installers: http://pypi.python.org/pypi/pandas
Code Repository: http://github.com/pydata/pandas
Mailing List: http://groups.google.com/group/pystatsmodels
Blog: http://blog.wesmckinney.com

From fperez.net at gmail.com  Wed Feb 29 19:35:27 2012
From: fperez.net at gmail.com (Fernando Perez)
Date: Wed, 29 Feb 2012 16:35:27 -0800
Subject: [SciPy-User] [pystatsmodels] ANN: pandas 0.7.1 released
In-Reply-To:
References:
Message-ID:

On Wed, Feb 29, 2012 at 4:31 PM, Wes McKinney wrote:
>
> I'm happy to announce the pandas 0.7.1 release. This is primarily a
> bugfix release from 0.7.0, but includes a couple notable performance
> enhancements and a handful of new functions and features. Source
> archives and Windows installers are now available on PyPI.

Congrats! An awesome tool keeps getting better... Thanks a lot for the
relentless improvements.

Cheers,

f

From mark.pundurs at nokia.com  Wed Feb 29 11:01:26 2012
From: mark.pundurs at nokia.com (Pundurs Mark (Nokia-LC/Chicago))
Date: Wed, 29 Feb 2012 10:01:26 -0600
Subject: [SciPy-User] ImportError: *.so: cannot open shared object file: No such file or directory
In-Reply-To:
References:
Message-ID: <8A18D8FA4293104C9A710494FD6C273CB737ACD8@hq-ex-mb03.ad.navteq.com>

Adding to LD_LIBRARY_PATH didn't help. I added /usr/lib (where I can see the
*.so files) but got the same ImportError; then, just in case, I added
/tools/python/2.6.3_3/linux_x86_64/lib and
/tools/python/2.6.3_3/linux_x86_64/lib/python2.6/site-packages (where scipy
lives) but still got the ImportError.

Any other ideas on how to debug or work around this? (I'm trying to dig
through the sys elements discussed in
http://docs.python.org/reference/simple_stmts.html#import, but it's thorny
stuff.)

> Date: Sat, 12 Nov 2011 01:02:19 +0100
> From: Paul Anton Letnes
>
> Assuming bash, type this into your shell to export the variable for as
> long as you keep your shell running. If you want it to stick
> permanently, add the line to ~/.bashrc.
>
> export LD_LIBRARY_PATH=/folder/that/contains/libs:$LD_LIBRARY_PATH
>
> Cheers
> Paul
>
> > Thanks, David! How do I (a Linux newbie) add paths to environment
> > variable LD_LIBRARY_PATH?
>
> > Hi Mark,
> >
> > On Wed, Nov 2, 2011 at 3:38 PM, Pundurs, Mark wrote:
> >> I want to use the function stats.norm.isf, but no matter how I try
> >> to import it I end up with the error "ImportError: .so: cannot
> >> open shared object file: No such file or directory". The .so files
> >> cited do exist in /usr/lib (as symbolic links to other .so files that
> >> also exist in that directory). From what I've read, that's where
> >> they're supposed to be - but I think the Python installation is in a
> >> nonstandard location. Is that the problem? How can I work around it?
> >
> > I believe RHEL 4 uses g77 as its default fortran compiler, so you have
> > a custom gfortran build somewhere, am I right?
> >
> > If so, you need to add the paths where libgfortran.so and liblapack.so
> > are to the environment variable LD_LIBRARY_PATH.
> > Given that scipy has been built (by someone else for you?), you may want
> > to ask them about it for the exact locations of those libraries.
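Beyond the advice quoted in this thread, one rough way to narrow down which
shared library the dynamic linker cannot find is to run ldd over scipy's
compiled extension modules. The sketch below is hypothetical and was not part
of the original exchange; it assumes a Linux system with ldd on the PATH and
that the top-level scipy package itself imports, so its install directory can
be located from the package.

# Hypothetical debugging helper: report every scipy extension module with an
# unresolved shared-library dependency, so you know which directory needs to
# be added to LD_LIBRARY_PATH.
import os
import subprocess
import scipy

scipy_dir = os.path.dirname(scipy.__file__)
for root, dirs, files in os.walk(scipy_dir):
    for name in files:
        if name.endswith('.so'):
            path = os.path.join(root, name)
            output = subprocess.Popen(['ldd', path],
                                      stdout=subprocess.PIPE).communicate()[0]
            if b'not found' in output:
                # This extension links against a library the loader cannot
                # locate; the ldd output names the missing library.
                print(path)
                print(output.decode())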