From josef.pktd at gmail.com Tue Jan 1 17:18:52 2013 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Tue, 1 Jan 2013 17:18:52 -0500 Subject: [SciPy-User] effect size instead of p-values Message-ID: A classical alternative to NHST (null-hypothesis significance testing): report your effect size and confidence intervals http://en.wikipedia.org/wiki/Effect_size http://onlinelibrary.wiley.com/doi/10.1111/j.1469-185X.2007.00027.x/abstract http://onlinelibrary.wiley.com/doi/10.1111/j.1460-9568.2011.07902.x/abstract I bumped into this while looking for power analysis. MIA in python, as far as I can tell. Josef From cyd at gnu.org Wed Jan 2 07:28:48 2013 From: cyd at gnu.org (Chong Yidong) Date: Wed, 02 Jan 2013 20:28:48 +0800 Subject: [SciPy-User] eigsh on sparse tridiagonal matrix: 2 orders of magnitude slower than Octave?? Message-ID: <87zk0rlqf3.fsf@gnu.org> I'd like to use scipy.sparse.linalg.eigsh() to solve the standard tridiagonal matrix from the finite-difference method (2's on the diagonal, -1's on the off-diagonals). Strangely, it is running MUCH slower than the equivalent code on GNU Octave---3.8 seconds as opposed to 0.04 seconds in the following example---and I don't know why. In the following SciPy code, eigsh takes 3.84 seconds to run: import scipy.sparse as sp import scipy.sparse.linalg as lin import time Nx = 1500 H = 2.0 * sp.eye(Nx, Nx, format='lil') Hoff1 = sp.eye(Nx, Nx, k=1, format='lil') Hoff2 = sp.eye(Nx, Nx, k=-1, format='lil') H = H - Hoff1 - Hoff2 H = H.tocsr() t = time.time() E = lin.eigsh(H, 25, which='SM', return_eigenvectors=False) print time.time() - t In the following equivalent Octave code, eigs takes 0.04 seconds: N = 1500; f = ones(N-1,1); H = 2.0 * eye(N) - diag(f,1) - diag(f,-1); H = sparse(H); tic; v = eigs(H, 25, 'sm'); toc; Surely it's the same ARPACK underneath, so how can this be happening? Anyone have any idea? I'm using SciPy 0.10.1 on Ubuntu x86-64. SciPy 0.9 also had the same problem. Comparison was made with Octave 3.6.2 running on the same machine. From pav at iki.fi Wed Jan 2 14:08:57 2013 From: pav at iki.fi (Pauli Virtanen) Date: Wed, 02 Jan 2013 21:08:57 +0200 Subject: [SciPy-User] eigsh on sparse tridiagonal matrix: 2 orders of magnitude slower than Octave?? In-Reply-To: <87zk0rlqf3.fsf@gnu.org> References: <87zk0rlqf3.fsf@gnu.org> Message-ID: 02.01.2013 14:28, Chong Yidong kirjoitti: [clip] > tic; v = eigs(H, 25, 'sm'); toc; [clip] In octave, 'sm' has a different meaning --- it uses shift-invert with sigma=0 rather than Arpack's small-eigenvalue finding: octave-3.6.2/src/DLD-FUNCTIONS/eigs.cc: 573 // Mode 1 for SM mode seems unstable for some reason. 574 // Use Mode 3 instead, with sigma = 0. 575 if (!error_state && !have_sigma && typ == "SM") 576 have_sigma = true; In general, answers to questions to "why does octave do XX" can be found reading its source code. From cintas.celia at gmail.com Thu Jan 3 20:57:59 2013 From: cintas.celia at gmail.com (Celia) Date: Thu, 3 Jan 2013 22:57:59 -0300 Subject: [SciPy-User] SciPy Conf Argentina Question Message-ID: <20130103225759.6c55764c@gmail.com> I am writing because we are organizing the First 'Python in Science' in Argentina, scheduled from 16-18 May 2013 in Puerto Madryn [1] (South of Argentina). This Conference will be a joint effort of the National University San Juan Bosco [2], Government of Puerto Madryn, SciPy Argentine Community in collaboration with several local and nationals sponsors. ... 
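For reference, a minimal sketch of the shift-invert route Pauli describes for the eigsh timing question earlier in this digest -- assuming a SciPy version whose eigsh accepts the sigma keyword. With sigma=0 and which='LM', ARPACK works on the inverted operator and returns the eigenvalues nearest zero, which is effectively what Octave's 'sm' does:

import time
import scipy.sparse as sp
import scipy.sparse.linalg as lin

Nx = 1500
H = (2.0 * sp.eye(Nx, Nx)
     - sp.eye(Nx, Nx, k=1)
     - sp.eye(Nx, Nx, k=-1)).tocsc()    # CSC suits the sparse factorization

t = time.time()
# shift-invert about sigma=0: smallest-magnitude eigenvalues via a solve,
# instead of asking ARPACK for which='SM' directly
E = lin.eigsh(H, 25, sigma=0, which='LM', return_eigenvectors=False)
print time.time() - t

On this tridiagonal matrix the factorization is cheap, so the run time should drop to a small fraction of a second; exact numbers will of course depend on the machine and the SciPy build.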
I would like to know if it could be possible to use SciPy as part of the name of the conference. Our web site will be up in a few days at [3]. Thanks in advance! Celia Cintas [1] http://www.madryn.gov.ar/turismo/2010/galeria/index.php [2] http://www.unp.edu.ar/ [3] http://www.scipycon.com.ar/ From pierre.raybaut at gmail.com Sat Jan 5 08:38:06 2013 From: pierre.raybaut at gmail.com (Pierre Raybaut) Date: Sat, 5 Jan 2013 14:38:06 +0100 Subject: [SciPy-User] ANN: Spyder v2.1.13 Message-ID: Hi all, On the behalf of Spyder's development team (http://code.google.com/p/spyderlib/people/list), I'm pleased to announce that Spyder v2.1.13 has been released and is available for Windows XP/Vista/7, GNU/Linux and MacOS X: http://code.google.com/p/spyderlib/ This is a pure maintenance release -- a lot of bugs were fixed since v2.1.11 (v2.1.12 was released exclusively inside WinPython distribution): http://code.google.com/p/spyderlib/wiki/ChangeLog Spyder is a free, open-source (MIT license) interactive development environment for the Python language with advanced editing, interactive testing, debugging and introspection features. Originally designed to provide MATLAB-like features (integrated help, interactive console, variable explorer with GUI-based editors for dictionaries, NumPy arrays, ...), it is strongly oriented towards scientific computing and software development. Thanks to the `spyderlib` library, Spyder also provides powerful ready-to-use widgets: embedded Python console (example: http://packages.python.org/guiqwt/_images/sift3.png), NumPy array editor (example: http://packages.python.org/guiqwt/_images/sift2.png), dictionary editor, source code editor, etc. Description of key features with tasty screenshots can be found at: http://code.google.com/p/spyderlib/wiki/Features On Windows platforms, Spyder is also available as a stand-alone executable (don't forget to disable UAC on Vista/7). This all-in-one portable version is still experimental (for example, it does not embed sphinx -- meaning no rich text mode for the object inspector) but it should provide a working version of Spyder for Windows platforms without having to install anything else (except Python 2.x itself, of course). Don't forget to follow Spyder updates/news: * on the project website: http://code.google.com/p/spyderlib/ * and on our official blog: http://spyder-ide.blogspot.com/ Last, but not least, we welcome any contribution that helps making Spyder an efficient scientific development/computing environment. Join us to help creating your favourite environment! (http://code.google.com/p/spyderlib/wiki/NoteForContributors) Enjoy! -Pierre From ralf.gommers at gmail.com Sat Jan 5 12:18:41 2013 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Sat, 5 Jan 2013 18:18:41 +0100 Subject: [SciPy-User] numpy.test() fail mac with homebrew python 2.7 In-Reply-To: References: <190AEC12-2F22-461D-8ED0-378801B65ADF@umich.edu> Message-ID: On Thu, Dec 27, 2012 at 10:39 PM, Ralf Gommers wrote: > > > > On Thu, Dec 27, 2012 at 10:09 PM, Adam Schneider wrote: > >> Hey Ralf, thanks for the quick response. 
>> >> I tried a clean install, but scipy still fails to install with this error: >> numpy.distutils.npy_pkg_config.PkgNotFound: Could not find file(s) >> ['/usr/local/lib/python2.7/site-packages/numpy/core/lib/npy-pkg-config/npymath.ini'] >> >> It looks like it is the same issue that people are running into here: >> http://stackoverflow.com/questions/12574604/scipy-install-on-mountain-lion-failing >> >> Trying the workaround suggested on SO and installing numpy from top of >> tree allows scipy to install from pip, but numpy.test() fails the following >> test: >> >> >> > >> Also, should scipy from pip (0.11.0) install cleanly with numpy from pip >> (1.6.2)? >> > > It should, but apparently it's broken in a not very reproducible way. > That's going to take some time to fix probably. This sort of thing is why I > usually avoid pip & co by the way. > This is a regression in pip on OS X: https://github.com/pypa/pip/issues/707. Pip 1.0 and 1.1 should work, pip 1.2 is broken. Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From andrew.collette at gmail.com Mon Jan 7 11:39:16 2013 From: andrew.collette at gmail.com (Andrew Collette) Date: Mon, 7 Jan 2013 09:39:16 -0700 Subject: [SciPy-User] ANN: HDF5 for Python (h5py) 2.1.1 Message-ID: Announcing HDF5 for Python (h5py) 2.1.1 ======================================= HDF5 for Python 2.1.1 is now available! This bugfix release also marks a number of changes for the h5py project intended to make the development process more responsive, including a move to GitHub and a switch to a rapid release model. Development has moved over to GitHub at http://github.com/h5py/h5py. We welcome bug reports and pull requests from anyone interested in contributing. Releases will now be made every 4-6 weeks, in order to get bugfixes and new features out to users quickly while still leaving time for testing. * New main website: http://www.h5py.org * Mailing list: http://groups.google.com/group/h5py What is h5py? ============= The h5py package is a Pythonic interface to the HDF5 binary data format. It lets you store huge amounts of numerical data, and easily manipulate that data from NumPy. For example, you can slice into multi-terabyte datasets stored on disk, as if they were real NumPy arrays. Thousands of datasets can be stored in a single file, categorized and tagged however you want. H5py uses straightforward NumPy and Python metaphors, like dictionary and NumPy array syntax. For example, you can iterate over datasets in a file, or check out the .shape or .dtype attributes of datasets. You don't need to know anything special about HDF5 to get started. In addition to the easy-to-use high level interface, h5py rests on a object-oriented Cython wrapping of the HDF5 C API. Almost anything you can do from C in HDF5, you can do from h5py. Best of all, the files you create are in a widely-used standard binary format, which you can exchange with other people, including those who use programs like IDL and MATLAB. What's new in 2.1.1? ==================== This is a bugfix release. 
The most substantial changes were: * Fixed a memory leak related to variable-length strings (Thanks to Luke Campbell for extensive testing and bug reports) * Fixed a threading deadlock related to the use of H5Aiterate * Fixed a double INCREF memory leak affecting Unicode variable-length strings * Fixed an exception when taking the repr() of objects with non-ASCII names From ralf.gommers at gmail.com Mon Jan 7 12:43:20 2013 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Mon, 7 Jan 2013 18:43:20 +0100 Subject: [SciPy-User] SciPy Conf Argentina Question In-Reply-To: <20130103225759.6c55764c@gmail.com> References: <20130103225759.6c55764c@gmail.com> Message-ID: On Fri, Jan 4, 2013 at 2:57 AM, Celia wrote: > I am writing because we are organizing the First 'Python in Science' > in Argentina, scheduled from 16-18 May 2013 in Puerto Madryn [1] (South > of Argentina). This Conference will be a joint effort of the National > University San Juan Bosco [2], Government of Puerto Madryn, SciPy > Argentine Community in collaboration with several local and nationals > sponsors. > ... I would like to know if it could be possible to use SciPy as part of > the name of the > conference. > Hi Celia, great to hear that a Python in Science conference is organized in Argentina. I hope one of the organizers of SciPy / EuroSciPy / SciPy India will comment, but I don't really see an issue with using SciPy in the conference name. Cheers, Ralf > > Our web site will be up in a few days at [3]. > > Thanks in advance! > > Celia Cintas > > [1] http://www.madryn.gov.ar/turismo/2010/galeria/index.php > [2] http://www.unp.edu.ar/ > [3] http://www.scipycon.com.ar/ > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > -------------- next part -------------- An HTML attachment was scrubbed... URL: From terribleangel at gmail.com Mon Jan 7 15:39:42 2013 From: terribleangel at gmail.com (Will) Date: Mon, 7 Jan 2013 20:39:42 +0000 (UTC) Subject: [SciPy-User] Fwd: scipy.test() fails for 0.11.0 on OS X 10.8.2 References: Message-ID: > The "DeprecationWarning: non-integer scalar index" errors are due > to a very recent change in numpy master for which scipy still has to > be updated. They're not a problem, the code being tested will only stop > working with numpy 1.9. > > For the failures there's no good solution on OS X 10.8 except for > not building against the Accelerate Framework but against Netlib > BLAS/LAPACK instead. A description of how to do that can be > found at the bottom of http://projects.scipy.org/scipy/ticket/1737. > > Ralf I am not sure that I am familiar enough with compiling source code to follow all of the flags set in the method described in ticket 1737. How serious are these failures? If I am just doing simple calculations, am I likely to encounter problems if I use the version of scipy on OS X 10.8.2 compiled against the Accelerate Framework? PS Sorry if this ends up getting double posted. I tried to send it without being a member of the list at first, so I think it got rejected. 
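As an aside to the Accelerate-versus-Netlib discussion in this thread, a quick way to check which BLAS/LAPACK a given NumPy/SciPy installation was built against (the output format varies between versions, so treat this as a sketch):

import numpy as np
import scipy

np.show_config()      # look for accelerate/veclib versus atlas/openblas entries
scipy.show_config()   # the same information is in scipy.__config__.show()

If the build information mentions Accelerate (or vecLib) on OS X 10.8, that is the configuration associated with the failures discussed in this thread.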
From ralf.gommers at gmail.com Mon Jan 7 15:51:06 2013 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Mon, 7 Jan 2013 21:51:06 +0100 Subject: [SciPy-User] Fwd: scipy.test() fails for 0.11.0 on OS X 10.8.2 In-Reply-To: References: Message-ID: On Mon, Jan 7, 2013 at 9:39 PM, Will wrote: > > The "DeprecationWarning: non-integer scalar index" errors are due > > to a very recent change in numpy master for which scipy still has to > > be updated. They're not a problem, the code being tested will only stop > > working with numpy 1.9. > > > > For the failures there's no good solution on OS X 10.8 except for > > not building against the Accelerate Framework but against Netlib > > BLAS/LAPACK instead. A description of how to do that can be > > found at the bottom of http://projects.scipy.org/scipy/ticket/1737. > > > > Ralf > > I am not sure that I am familiar enough with compiling > source code to follow all of the flags set in the > method described in ticket 1737. How serious are > these failures? If I am just doing simple calculations, > am I likely to encounter problems if I use the version > of scipy on OS X 10.8.2 compiled against the > Accelerate Framework? > Not all that likely. Almost all failures are in ARPACK (sparse linear algebra), so if you don't use that you're OK most likely. Another option is to use a binary installer, either a scipy one from sourceforge, or something like EPD which installs the whole stack. Both should work fine. > > PS Sorry if this ends up getting double posted. I tried to send it > without being a member of the list at first, so I think it got > rejected. No worries. Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From emmanuelle.gouillart at normalesup.org Mon Jan 7 17:37:44 2013 From: emmanuelle.gouillart at normalesup.org (Emmanuelle Gouillart) Date: Mon, 7 Jan 2013 23:37:44 +0100 Subject: [SciPy-User] SciPy Conf Argentina Question In-Reply-To: <20130103225759.6c55764c@gmail.com> References: <20130103225759.6c55764c@gmail.com> Message-ID: <20130107223744.GA9427@phare.normalesup.org> Hi Celia, I'm sure nobody will object to your conference using the name scipy. The different conferences on Scientific Python (http://conference.scipy.org/index.html) use the name, so you can use it too if your conference shares the same purposes. Thanks for promoting the use of Scientific Python in Argentina! Cheers, Emmanuelle On Thu, Jan 03, 2013 at 10:57:59PM -0300, Celia wrote: > I am writing because we are organizing the First 'Python in Science' > in Argentina, scheduled from 16-18 May 2013 in Puerto Madryn [1] (South > of Argentina). This Conference will be a joint effort of the National > University San Juan Bosco [2], Government of Puerto Madryn, SciPy > Argentine Community in collaboration with several local and nationals > sponsors. > ... I would like to know if it could be possible to use SciPy as part of the name of the > conference. > Our web site will be up in a few days at [3]. > Thanks in advance! 
> Celia Cintas > [1] http://www.madryn.gov.ar/turismo/2010/galeria/index.php > [2] http://www.unp.edu.ar/ > [3] http://www.scipycon.com.ar/ > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user From fperez.net at gmail.com Mon Jan 7 19:31:14 2013 From: fperez.net at gmail.com (Fernando Perez) Date: Mon, 7 Jan 2013 16:31:14 -0800 Subject: [SciPy-User] SciPy Conf Argentina Question In-Reply-To: <20130107223744.GA9427@phare.normalesup.org> References: <20130103225759.6c55764c@gmail.com> <20130107223744.GA9427@phare.normalesup.org> Message-ID: On Mon, Jan 7, 2013 at 2:37 PM, Emmanuelle Gouillart wrote: > I'm sure nobody will object to your conference using the name scipy. The > different conferences on Scientific Python > (http://conference.scipy.org/index.html) use the name, so you can use it > too if your conference shares the same purposes. Thanks for promoting the > use of Scientific Python in Argentina! Quite the opposite, hopefully once you have a site set up it will be linked to permanently from the main landing page: http://conference.scipy.org/index.html so that we can give it more visibility via the main site. It's great to see the (already very active) Argentinian Python community veering more in the scientific direction, thanks for your efforts! Best, f From cintas.celia at gmail.com Mon Jan 7 22:15:35 2013 From: cintas.celia at gmail.com (Celia) Date: Tue, 8 Jan 2013 00:15:35 -0300 Subject: [SciPy-User] SciPy Conf Argentina Question In-Reply-To: References: <20130103225759.6c55764c@gmail.com> <20130107223744.GA9427@phare.normalesup.org> Message-ID: <20130108001535.6f8d39f5@gmail.com> El Mon, 7 Jan 2013 16:31:14 -0800 Fernando Perez escribi?: > On Mon, Jan 7, 2013 at 2:37 PM, Emmanuelle Gouillart > wrote: > > I'm sure nobody will object to your conference using the name > > scipy. The different conferences on Scientific Python > > (http://conference.scipy.org/index.html) use the name, so you can > > use it too if your conference shares the same purposes. Thanks for > > promoting the use of Scientific Python in Argentina! > > Quite the opposite, hopefully once you have a site set up it will be > linked to permanently from the main landing page: > > http://conference.scipy.org/index.html > > so that we can give it more visibility via the main site. It's great > to see the (already very active) Argentinian Python community veering > more in the scientific direction, thanks for your efforts! That would be great ! In a few weeks we will have the website (complete) and flyers for the call of talks/posters... Thanks for all the answers! Cheers, Celia From terribleangel at gmail.com Tue Jan 8 08:41:26 2013 From: terribleangel at gmail.com (Will) Date: Tue, 8 Jan 2013 13:41:26 +0000 (UTC) Subject: [SciPy-User] Fwd: scipy.test() fails for 0.11.0 on OS X 10.8.2 References: Message-ID: Ralf Gommers gmail.com> writes: > > Another option is to use a binary installer, either a scipy > one from sourceforge, or something like EPD which > installs the whole stack. Both should work fine. > Thanks for the suggestions. I would like to use the scipy installer, but it gives me the "Scipy requires System Python 2.7" message. I have Python 2.7 from python.org installed (via Homebrew). Is there any way to show the installer that Python is installed (is it looking for python in a particular directory?)? I tried EPD and the tests completed with only skips and known failures (and a bunch of warnings). 
I would prefer to get scipy working with the Python I already have set up though. Will From cournape at gmail.com Tue Jan 8 11:35:04 2013 From: cournape at gmail.com (David Cournapeau) Date: Tue, 8 Jan 2013 10:35:04 -0600 Subject: [SciPy-User] Fwd: scipy.test() fails for 0.11.0 on OS X 10.8.2 In-Reply-To: References: Message-ID: On Tue, Jan 8, 2013 at 7:41 AM, Will wrote: > Ralf Gommers gmail.com> writes: > >> >> Another option is to use a binary installer, either a scipy >> one from sourceforge, or something like EPD which >> installs the whole stack. Both should work fine. >> > > Thanks for the suggestions. I would like to use the scipy > installer, but it gives me the "Scipy requires System > Python 2.7" message. I have Python 2.7 from python.org > installed (via Homebrew). Is there any way to show the > installer that Python is installed (is it looking for python > in a particular directory?)? I tried EPD and the tests > completed with only skips and known failures (and a > bunch of warnings). I would prefer to get scipy working > with the Python I already have set up though. You can't install the official scipy binaries on top of homebrew python, only on top of the *binary* python installer on python.org David From terribleangel at gmail.com Tue Jan 8 14:36:01 2013 From: terribleangel at gmail.com (Will) Date: Tue, 8 Jan 2013 19:36:01 +0000 (UTC) Subject: [SciPy-User] Fwd: scipy.test() fails for 0.11.0 on OS X 10.8.2 References: Message-ID: David Cournapeau gmail.com> writes: > > You can't install the official scipy binaries on top of homebrew > python, only on top of the *binary* python installer on python.org > > David > I looked up the install location (/Library/Frameworks) of the binary python installer from python.org and put a symbolic link in there to the homebrew python directory (/usr/bin/Cellar/python/2.7.3/Frameworks/Python.framework). After that I was able to run the scipy binary (scipy-0.11.0-py2.7-python.org-macosx10.6.dmg). I ran scipy.test('full') with this scipy and got basically the same results as when I installed scipy 0.11.0 from source with "python setup.py install" (in both cases, the final line was: FAILED (KNOWNFAIL=15, SKIP=43, errors=1, failures=73)). Is that result surprising? I am using numpy 1.6.2 which I installed from source and which completes its tests without failure. From ralf.gommers at gmail.com Wed Jan 9 15:56:38 2013 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Wed, 9 Jan 2013 21:56:38 +0100 Subject: [SciPy-User] Fwd: scipy.test() fails for 0.11.0 on OS X 10.8.2 In-Reply-To: References: Message-ID: On Tue, Jan 8, 2013 at 8:36 PM, Will wrote: > David Cournapeau gmail.com> writes: > > > > > You can't install the official scipy binaries on top of homebrew > > python, only on top of the *binary* python installer on python.org > > > > David > > > > > I looked up the install location (/Library/Frameworks) of the binary python > installer from python.org and put a symbolic link in there to the homebrew > python directory > (/usr/bin/Cellar/python/2.7.3/Frameworks/Python.framework). > After that I was able to run the scipy binary > (scipy-0.11.0-py2.7-python.org-macosx10.6.dmg). I ran scipy.test('full') > with this scipy and got basically the same results as when I installed > scipy > 0.11.0 from source with "python setup.py install" (in both cases, the > final line > was: FAILED (KNOWNFAIL=15, SKIP=43, errors=1, failures=73)). Is that > result > surprising? A little. 
IIRC these binaries (compiled on OS X 10.6) worked fine on 10.7, so something changed in again for 10.8. We really should stop using Accelerate for the official binaries. Ralf > I am using numpy 1.6.2 which I installed from source and which > completes its tests without failure. > -------------- next part -------------- An HTML attachment was scrubbed... URL: From pierre.raybaut at gmail.com Wed Jan 9 16:05:23 2013 From: pierre.raybaut at gmail.com (Pierre Raybaut) Date: Wed, 9 Jan 2013 22:05:23 +0100 Subject: [SciPy-User] ANN: first previews of WinPython for Python 3 32/64bit Message-ID: Hi all, I'm pleased to announce that the first previews of WinPython for Python 3 32bit and 64bit are available (WinPython v3.3.0.0alpha1): http://code.google.com/p/winpython/ This first release based on Python 3 required to migrate the following libraries which were only available for Python 2: * formlayout 1.0.12 * guidata 1.6.0dev1 * guiqwt 2.3.0dev1 * Spyder 2.1.14dev Please note that these libraries are still development release. [Special thanks to Christoph Gohlke for patching and building a version of PyQwt compatible with Python 3.3] WinPython is a free open-source portable distribution of Python for Windows, designed for scientists. It is a full-featured (see http://code.google.com/p/winpython/wiki/PackageIndex) Python-based scientific environment: * Designed for scientists (thanks to the integrated libraries NumPy, SciPy, Matplotlib, guiqwt, etc.: * Regular *scientific users*: interactive data processing and visualization using Python with Spyder * *Advanced scientific users and software developers*: Python applications development with Spyder, version control with Mercurial and other development tools (like gettext) * *Portable*: preconfigured, it should run out of the box on any machine under Windows (without any installation requirements) and the folder containing WinPython can be moved to any location (local, network or removable drive) * *Flexible*: one can install (or should I write "use" as it's portable) as many WinPython versions as necessary (like isolated and self-consistent environments), even if those versions are running different versions of Python (2.7, 3.x in the near future) or different architectures (32bit or 64bit) on the same machine * *Customizable*: using the integrated package manager (wppm, as WinPython Package Manager), it's possible to install, uninstall or upgrade Python packages (see http://code.google.com/p/winpython/wiki/WPPM for more details on supported package formats). *WinPython is not an attempt to replace Python(x,y)*, this is just something different (see http://code.google.com/p/winpython/wiki/Roadmap): more flexible, easier to maintain, movable and less invasive for the OS, but certainly less user-friendly, with less packages/contents and without any integration to Windows explorer [*]. [*] Actually there is an optional integration into Windows explorer, providing the same features as the official Python installer regarding file associations and context menu entry (this option may be activated through the WinPython Control Panel), and adding shortcuts to Windows Start menu. Enjoy! -Pierre From jrocher at enthought.com Wed Jan 9 17:32:28 2013 From: jrocher at enthought.com (Jonathan Rocher) Date: Wed, 9 Jan 2013 16:32:28 -0600 Subject: [SciPy-User] [SCIPY2013] Feedback on mini-symposia themes Message-ID: Dear community members, We are working hard to organize the SciPy2013 conference (Scientific Computing with Python) , this June 24th-29th in Austin, TX. 
We would like to probe the community about the themes you would be interested in contributing to or participating in for the mini-symposia at SciPy2013. These mini-symposia are held to discuss scientific computing applied to a specific *scientific domain/industry* during a half afternoon after the general conference. Their goal is to promote industry specific libraries and tools, and gather people with similar interests for discussions. For example, the SciPy2012 edition successfully hosted 4 mini-symposia on Astronomy/Astrophysics, Bio-informatics, Meteorology, and Geophysics. Please join us and voice your opinion to shape the next SciPy conference at: http://www.surveygizmo.com/s3/1114631/SciPy-2013-Themes Thanks, The Scipy2013 organizers -- Jonathan Rocher, PhD Scientific software developer Enthought, Inc. jrocher at enthought.com 1-512-536-1057 http://www.enthought.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From silva at lma.cnrs-mrs.fr Thu Jan 10 08:00:36 2013 From: silva at lma.cnrs-mrs.fr (Fabrice Silva) Date: Thu, 10 Jan 2013 14:00:36 +0100 Subject: [SciPy-User] Spline with constraint Message-ID: <1357822836.5794.5.camel@laptop-101> Hi, I am currently smoothing a set of data points using scipy.interpolate.UnivariateSpline. It seems OK while only considering the interval delimited by the data. The trouble is that the behaviour gets wild when looking beyond the extrema. Would it be possible to force the resulting the spline to pass through a pair of specified extra points ? It seems trivial for Bezier curves, but is there a tool in scipy for spline? Thanks, From pawel.kw at gmail.com Thu Jan 10 09:24:29 2013 From: pawel.kw at gmail.com (=?ISO-8859-2?Q?Pawe=B3_Kwa=B6niewski?=) Date: Thu, 10 Jan 2013 15:24:29 +0100 Subject: [SciPy-User] Spline with constraint In-Reply-To: <1357822836.5794.5.camel@laptop-101> References: <1357822836.5794.5.camel@laptop-101> Message-ID: Hi Fabrice, You can just add the additional points you want the interpolated curve to go through - for example using scipy.insert (at the beginning of the array) or scipy.append (at the end of the array). I suppose that anyhow the fitted spline might go crazy sooner or later - by adding additional data points you just move this away from real data. Cheers, Pawe? 2013/1/10 Fabrice Silva > Hi, > I am currently smoothing a set of data points using > scipy.interpolate.UnivariateSpline. It seems OK while only considering > the interval delimited by the data. The trouble is that the behaviour > gets wild when looking beyond the extrema. Would it be possible to force > the resulting the spline to pass through a pair of specified extra > points ? > It seems trivial for Bezier curves, but is there a tool in scipy for > spline? > > Thanks, > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > -------------- next part -------------- An HTML attachment was scrubbed... URL: From pawel.kw at gmail.com Thu Jan 10 09:41:25 2013 From: pawel.kw at gmail.com (=?ISO-8859-2?Q?Pawe=B3_Kwa=B6niewski?=) Date: Thu, 10 Jan 2013 15:41:25 +0100 Subject: [SciPy-User] cubic spline interpolation - derivative value in an end point In-Reply-To: References: Message-ID: Hi, Refreshing this old topic - I'm encountering another problem here. The hack provided for smoothed spline fitting works well if I want to fix the derivative just at one end of the data set. But what if I need to do it at both ends? 
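A sketch of the extra-anchor-point idea from the spline exchange above (x and y are the original data arrays, and x_left, y_left, x_right, y_right are placeholder names for the extra points the curve should pass near; the weights are optional and simply pull the smoothed fit through them):

import numpy as np
from scipy.interpolate import UnivariateSpline

x_ext = np.concatenate(([x_left], x, [x_right]))    # keep x_ext strictly increasing
y_ext = np.concatenate(([y_left], y, [y_right]))
w = np.ones(len(y_ext))
w[0] = w[-1] = 1e3        # weight the two artificial end points heavily

spl = UnivariateSpline(x_ext, y_ext, w=w, s=len(x_ext))

This tames the behaviour just outside the data range by construction, but it does not by itself clamp derivatives at the ends.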
I was trying to rewrite this method, but it just doesn't work. What I tried to do, is to add a second step of spline fitting - first, I extend my data by 1 point at the beginning and fit the y value to get the spline with the desired derivative, then I tried to use this new data set and extend it again - this time at the end, repeating the procedure to get the derivative fixed at the end. For some reason the second step ruins the derivative clamping done in the first step. Any ideas why this might happen? Cheers, Pawe? 2012/12/14 Pawe? Kwa?niewski > Hi, > > Actually, I'm having a similar problem now. Your hack should do the job - > thanks Eric. > > Cheers, > > Pawe? > > > > 2012/11/27 Maxim > >> Thank you a lot, Eric! >> >> >> >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user >> > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From g.brandl at gmx.net Sat Jan 12 07:54:10 2013 From: g.brandl at gmx.net (Georg Brandl) Date: Sat, 12 Jan 2013 13:54:10 +0100 Subject: [SciPy-User] curve_fit with float32 values Message-ID: Hi all, while trying to show a colleague who hadn't done any Python before how easy it is to load and fit some data, we had the problem that curve_fit didn't appear to do any fitting at all. In the end (which took quite a while!) we found that the problem was that the X data (which was directly loaded from a HDF file) had a float32 dtype. This seems to confuse curve_fit. Same goes for float16. float128 at least raises an exception. Integer types seem fine given rounding, see the code/output below. If it's a big task to make curve_fit work with float32, then at least a warning would be appreciated if the input types won't work. cheers, Georg Test code below: from numpy import sqrt, exp, pi, random, linspace, array from scipy.optimize import curve_fit def gauss(x, b, a, c, w): return b + a / sqrt(2*pi)*exp(-(x-c)**2/(2*w**2)) op = array([1.0, 50.0, 88.0, 1.0]) print 'Original: ', op guess = array([0.0, 40.0, 90.0, 2.0]) print 'Guess: ', guess x0 = linspace(80, 100, 500) y = gauss(x0, *op) for dt in ['float64', 'float32', 'float16', 'int64', 'int32']: x = x0.astype(dt) p, c = curve_fit(gauss, x, y, guess) print 'Fit (%-7s):' % dt, p ***** Output: Original: [ 1. 50. 88. 1.] Guess: [ 0. 40. 90. 2.] Fit (float64): [ 1. 50. 88. -1.] Fit (float32): [ 2.87328455e-07 4.00000000e+01 9.00000000e+01 2.00000000e+00] Fit (float16): [ 0. 40. 90. 2.] Fit (int64 ): [ 0.99921103 48.00234957 87.50399972 1.03986117] Fit (int32 ): [ 0.99921103 48.00234957 87.50399972 1.03986117] From pav at iki.fi Sat Jan 12 09:58:59 2013 From: pav at iki.fi (Pauli Virtanen) Date: Sat, 12 Jan 2013 16:58:59 +0200 Subject: [SciPy-User] curve_fit with float32 values In-Reply-To: References: Message-ID: 12.01.2013 14:54, Georg Brandl kirjoitti: [clip] > In the end (which took quite a while!) we found that the problem > was that the X data (which was directly loaded from a HDF file) > had a float32 dtype. This seems to confuse curve_fit. Same goes > for float16. float128 at least raises an exception. Integer types > seem fine given rounding, see the code/output below. AFAIK, a likely bug here is probably the choice of epsilon for numerical differentiation in leastsq --- the chosen epsilon is probably smaller than the machine epsilon for float32, hence problems appear. 
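To make Pauli's point concrete with the script above (a sketch, reusing Georg's gauss, x, y and guess; exact behaviour may vary between SciPy versions): either promote the abscissa to double precision before fitting, or hand leastsq a finite-difference step that float32 can resolve -- curve_fit forwards extra keyword arguments to leastsq, so epsfcn can be set explicitly:

import numpy as np
from scipy.optimize import curve_fit

# 1) simplest fix: do the fit in double precision
p, c = curve_fit(gauss, x.astype(np.float64), y, guess)

# 2) keep float32 data, but use a derivative step the data type can resolve
p, c = curve_fit(gauss, x, y, guess, epsfcn=np.finfo(np.float32).eps)

The second variant only helps as long as the model evaluation itself keeps enough precision; casting to float64 is the safer habit.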
-- Pauli Virtanen From josef.pktd at gmail.com Sat Jan 12 22:38:04 2013 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Sat, 12 Jan 2013 22:38:04 -0500 Subject: [SciPy-User] Power Message-ID: Oh, it's just power of statistical hypothesis tests experiment: how to solve for root with respect to each argument of a function first steps in power and sample size calculations. coming (eventually) to statsmodels.stats Dumping a script, in case anyone is interested. Josef ----------------- # -*- coding: utf-8 -*- """Statistical power, solving for nobs, ... - trial version Created on Sat Jan 12 21:48:06 2013 Author: Josef Perktold Example roundtrip - root with respect to all variables calculated, desired nobs 33.367204205 33.367204205 effect 0.5 0.5 alpha 0.05 0.05 beta 0.8 0.8 """ import numpy as np from scipy import stats #power d = 0.68 #effect size nobs = 20 #observations alpha = 0.05 alpha_two = alpha / 2. crit = stats.t.isf(alpha_two, nobs-1) #critical value at df=nobs pow_t = stats.nct(nobs-1, d*np.sqrt(nobs)).sf(stats.t.isf(alpha_two, nobs-1)) pow_f = stats.ncf(1, nobs-1, d**2*nobs).sf(stats.f.isf(alpha, 1, nobs-1)) def ttest_power(effect_size, nobs, alpha, df=None, alternative='two-sided'): '''Calculate power of a ttest ''' d = effect_size if df is None: df = nobs - 1 if alternative == 'two-sided': alpha_ = alpha / 2. #no inplace changes, doesn't work else: alpha_ = alpha pow_ = stats.nct(df, d*np.sqrt(nobs)).sf(stats.t.isf(alpha_, df)) return pow_ from scipy import optimize def solve_power_nobs(effect_size, alpha, beta): '''solve for required sample size for t-test to obtain power beta ''' pow_n = lambda nobs: ttest_power(effect_size, nobs, alpha) - beta return optimize.fsolve(pow_n, 10).item() effect_size, alpha, beta = 0.5, 0.05, 0.8 nobs_p = solve_power_nobs(effect_size, alpha, beta) print nobs_p print ttest_power(effect_size, nobs_p, alpha), beta #generic implementation #---------------------- #to be reused for other tests def ttest_power_id(effect_size=None, nobs=None, alpha=None, beta=None, df=None, alternative='two-sided'): '''identity for power calculation, should be zero ''' return ttest_power(effect_size=effect_size, nobs=nobs, alpha=alpha, df=df, alternative=alternative) - beta #import functools as ft #too limited start_ttp = dict(effect_size=0.01, nobs=10., alpha=0.15, beta=0.6) #possible rootfinding problem for effect_size, starting small seems to work def solve_ttest_power(**kwds): '''solve for any one of the parameters of a t-test for t-test the keywords are: effect_size, nobs, alpha, beta exactly one needs to be `None`, all others need numeric values ''' #TODO: maybe use explicit kwds, # nicer but requires inspect? 
and not generic across tests #TODO: use explicit calculation for beta=None key = [k for k,v in kwds.iteritems() if v is None] #print kwds, key; if len(key) != 1: raise ValueError('need exactly one keyword that is None') key = key[0] def func(x): kwds[key] = x return ttest_power_id(**kwds) return optimize.fsolve(func, start_ttp[key]).item() #scalar print '\nroundtrip - root with respect to all variables' print '\n calculated, desired' print 'nobs ', solve_ttest_power(effect_size=effect_size, nobs=None, alpha=alpha, beta=beta), nobs_p print 'effect', solve_ttest_power(effect_size=None, nobs=nobs_p, alpha=alpha, beta=beta), effect_size print 'alpha ', solve_ttest_power(effect_size=effect_size, nobs=nobs_p, alpha=None, beta=beta), alpha print 'beta ', solve_ttest_power(effect_size=effect_size, nobs=nobs_p, alpha=alpha, beta=None), beta -------------- From dineshbvadhia at hotmail.com Sun Jan 13 07:06:44 2013 From: dineshbvadhia at hotmail.com (Dinesh B Vadhia) Date: Sun, 13 Jan 2013 04:06:44 -0800 Subject: [SciPy-User] Sparse matrix scalability Message-ID: For 64-bit systems, has anyone or group published figures on scaling a scipy sparse matrix calculation (eg. b <- Ax) ie. what is the performance as the number of rows, columns and nnz increase linearly for very large number of rows (>100m) and columns? Thx. -------------- next part -------------- An HTML attachment was scrubbed... URL: From sonicatedboom-s at yahoo.com Sun Jan 13 11:08:50 2013 From: sonicatedboom-s at yahoo.com (Jackson Li) Date: Sun, 13 Jan 2013 16:08:50 +0000 (UTC) Subject: [SciPy-User] Weighted KDE References: <0CE25948-3ED7-4928-9C63-4F44366C3AF5@yale.edu> Message-ID: gmail.com> writes: > > On Sun, May 13, 2012 at 1:07 PM, Zachary Pincus yale.edu> wrote: > > Hello all, > > > > A while ago, someone asked on this list about whether it would be simple to modify > scipy.stats.kde.gaussian_kde to deal with weighted data: > > http://mail.scipy.org/pipermail/scipy-user/2008-November/018578.html > > > > Anne and Robert assured the writer that this was pretty simple (modulo bandwidth selection), though I > couldn't find any code that the original author may have generated based on that advice. > > > > I've got a problem that could (perhaps) be solved neatly with weighed KDE, so I'd like to give this a go. I > assume that at a minimum, to get basic gaussian_kde.evaluate() functionality: > > > > (1) The covariance calculation would need to be replaced by a weighted- covariance calculation. (Simple enough.) > > > > (2) In evaluate(), the critical part looks like this (and a similar stanza that loops over the points instead): > > # if there are more points than data, so loop over data > > for i in range(self.n): > > ? ?diff = self.dataset[:, i, newaxis] - points > > ? ?tdiff = dot(self.inv_cov, diff) > > ? ?energy = sum(diff*tdiff,axis=0) / 2.0 > > ? ?result = result + exp(-energy) > > > > I assume that, further, the 'diff' values ought to be scaled by the weights, too. Is this all that would need > to be done? (For the integration and resampling, obviously, there would be a bit of other work...) 
> > it looks to me that way, scaled according to weight by dataset points > > I don't see what the norm_factor should be: > self._norm_factor = sqrt(linalg.det(2*pi*self.covariance)) * self.n > there should be the weights somewhere in there, maybe just replace > self.n by sum(weights) given a constant covariance > > sampling doesn't look difficult, if we want biased sampling, then > instead of randint, we would need weighted randint (non-uniform) > > integration might require more work, or not (I never tried to understand them) > > (I don't know if kde in statsmodels has weights on the schedule.) > > Josef > mostly guessing > > > > > Thanks, > > Zach > > _______________________________________________ > > SciPy-User mailing list > > SciPy-User scipy.org > > http://mail.scipy.org/mailman/listinfo/scipy-user > Hi, I am facing the same problem as well, but can't figure out how the weighting should be done exactly. Has anybody successfully completed the modification of the code to allow a weighted kde? I am attempting to perform kde on a set of imaging data with X, Y, and an additional "temperature" column. Performing the kde on only the X,Y axes gives a working heatmap showing the spatial distribution of the data points, but I would also like to use them to see the "temperature" profile (the third axis), much like a geographical heatmap showing temperature or rainfall values over a X-Y map. I found another set of code from http://pastebin.com/LNdYCZgw which allows weighted kde, but when I tried it out with my data, it took much longer than the normal kde (>1 hour) when the original code took only a about twenty seconds (despite claims that it was faster). Thanks, Jackson From joferkington at gmail.com Sun Jan 13 11:44:36 2013 From: joferkington at gmail.com (Joe Kington) Date: Sun, 13 Jan 2013 10:44:36 -0600 Subject: [SciPy-User] Weighted KDE In-Reply-To: References: <0CE25948-3ED7-4928-9C63-4F44366C3AF5@yale.edu> Message-ID: On Sun, Jan 13, 2013 at 10:08 AM, Jackson Li wrote: > gmail.com> writes: > > > > > On Sun, May 13, 2012 at 1:07 PM, Zachary Pincus > yale.edu> > wrote: > > > Hello all, > > > > > > A while ago, someone asked on this list about whether it would be > simple to > modify > > scipy.stats.kde.gaussian_kde to deal with weighted data: > > > http://mail.scipy.org/pipermail/scipy-user/2008-November/018578.html > > > > > > Anne and Robert assured the writer that this was pretty simple (modulo > bandwidth selection), though I > > couldn't find any code that the original author may have generated based > on > that advice. > > > > > > I've got a problem that could (perhaps) be solved neatly with weighed > KDE, > so I'd like to give this a go. I > > assume that at a minimum, to get basic gaussian_kde.evaluate() > functionality: > > > > > > (1) The covariance calculation would need to be replaced by a weighted- > covariance calculation. (Simple enough.) > > > > > > (2) In evaluate(), the critical part looks like this (and a similar > stanza > that loops over the points instead): > > > # if there are more points than data, so loop over data > > > for i in range(self.n): > > > diff = self.dataset[:, i, newaxis] - points > > > tdiff = dot(self.inv_cov, diff) > > > energy = sum(diff*tdiff,axis=0) / 2.0 > > > result = result + exp(-energy) > > > > > > I assume that, further, the 'diff' values ought to be scaled by the > weights, > too. Is this all that would need > > to be done? (For the integration and resampling, obviously, there would > be a > bit of other work...) 
> > > > it looks to me that way, scaled according to weight by dataset points > > > > I don't see what the norm_factor should be: > > self._norm_factor = sqrt(linalg.det(2*pi*self.covariance)) * self.n > > there should be the weights somewhere in there, maybe just replace > > self.n by sum(weights) given a constant covariance > > > > sampling doesn't look difficult, if we want biased sampling, then > > instead of randint, we would need weighted randint (non-uniform) > > > > integration might require more work, or not (I never tried to understand > them) > > > > (I don't know if kde in statsmodels has weights on the schedule.) > > > > Josef > > mostly guessing > > > > > > > > Thanks, > > > Zach > > > _______________________________________________ > > > SciPy-User mailing list > > > SciPy-User scipy.org > > > http://mail.scipy.org/mailman/listinfo/scipy-user > > > > Hi, > > I am facing the same problem as well, but can't figure out how the > weighting > should be done exactly. > > Has anybody successfully completed the modification of the code to allow a > weighted kde? I am attempting to perform kde on a set of imaging data with > X, Y, > and an additional "temperature" column. > > Performing the kde on only the X,Y axes gives a working heatmap showing the > spatial distribution of the data points, but I would also like to use them > to > see the "temperature" profile (the third axis), much like a geographical > heatmap > showing temperature or rainfall values over a X-Y map. > > I found another set of code from > http://pastebin.com/LNdYCZgw > which allows weighted kde, but when I tried it out with my data, it took > much > longer than the normal kde (>1 hour) when the original code took only a > about > twenty seconds (despite claims that it was faster). > > Thanks, > Jackson > For what it's worth, the code you linked to is much slower for small sample sizes. It's only faster with large numbers (>1e4) of points. It also has a bit of a different use case than gaussian_kde. It's only intended for making a regularly gridded KDE of a very large number of points on a relatively fine grid. It bins the data onto a regular grid and convolves it with an approriate gaussian kernel. This is a reasonable approximation when you're dealing with a large number of points, but not so reasonable if you only have a handful. Because the size of the gaussian kernel can be very large when the sample size is low, the convolution can be very slow for small sample sizes. Also, If I recall correctly, there's a stray flipud that got left in there. You'll want to take it out. (Also, while I think that got posted only a couple of years ago, I wrote it much longer ago than that... There's some less-than-ideal code in there...) However, are you sure that you want a kernel density estimate? What you're describing sounds like interpolation, not a weighted KDE. As an example, a weighted KDE would be used when you wanted to show the density of point estimates while weighting it by error in the location of the point. Instead, it sounds like you have a third variable that you want to make a continuous map of based on irregularly sampled points. If so, have a look at scipy.interpolate (and particularly scipy.interpolate.Rbf). Hope that helps, -Joe > > > > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > -------------- next part -------------- An HTML attachment was scrubbed... 
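A minimal sketch of the Rbf suggestion above, with x, y and temp standing in for the scattered sample coordinates and the "temperature" column (the random test data, grid size and default multiquadric basis are arbitrary choices):

import numpy as np
import matplotlib.pyplot as plt
from scipy.interpolate import Rbf

x = np.random.uniform(0, 10, 50)
y = np.random.uniform(0, 10, 50)
temp = 20 + 5 * np.sin(x) * np.cos(y)      # stand-in for the measured values

rbf = Rbf(x, y, temp)                      # radial basis interpolant through the samples
xi, yi = np.mgrid[0:10:200j, 0:10:200j]    # regular grid covering the data
ti = rbf(xi, yi)                           # smooth "temperature" surface

plt.pcolormesh(xi, yi, ti)
plt.scatter(x, y, c=temp)
plt.colorbar()
plt.show()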
URL: From joferkington at gmail.com Sun Jan 13 11:53:20 2013 From: joferkington at gmail.com (Joe Kington) Date: Sun, 13 Jan 2013 10:53:20 -0600 Subject: [SciPy-User] Weighted KDE In-Reply-To: References: <0CE25948-3ED7-4928-9C63-4F44366C3AF5@yale.edu> Message-ID: On Sun, Jan 13, 2013 at 10:44 AM, Joe Kington wrote: > > > > On Sun, Jan 13, 2013 at 10:08 AM, Jackson Li wrote: > >> gmail.com> writes: >> >> > >> > On Sun, May 13, 2012 at 1:07 PM, Zachary Pincus >> yale.edu> >> wrote: >> > > Hello all, >> > > >> > > A while ago, someone asked on this list about whether it would be >> simple to >> modify >> > scipy.stats.kde.gaussian_kde to deal with weighted data: >> > > http://mail.scipy.org/pipermail/scipy-user/2008-November/018578.html >> > > >> > > Anne and Robert assured the writer that this was pretty simple (modulo >> bandwidth selection), though I >> > couldn't find any code that the original author may have generated >> based on >> that advice. >> > > >> > > I've got a problem that could (perhaps) be solved neatly with weighed >> KDE, >> so I'd like to give this a go. I >> > assume that at a minimum, to get basic gaussian_kde.evaluate() >> functionality: >> > > >> > > (1) The covariance calculation would need to be replaced by a >> weighted- >> covariance calculation. (Simple enough.) >> > > >> > > (2) In evaluate(), the critical part looks like this (and a similar >> stanza >> that loops over the points instead): >> > > # if there are more points than data, so loop over data >> > > for i in range(self.n): >> > > diff = self.dataset[:, i, newaxis] - points >> > > tdiff = dot(self.inv_cov, diff) >> > > energy = sum(diff*tdiff,axis=0) / 2.0 >> > > result = result + exp(-energy) >> > > >> > > I assume that, further, the 'diff' values ought to be scaled by the >> weights, >> too. Is this all that would need >> > to be done? (For the integration and resampling, obviously, there would >> be a >> bit of other work...) >> > >> > it looks to me that way, scaled according to weight by dataset points >> > >> > I don't see what the norm_factor should be: >> > self._norm_factor = sqrt(linalg.det(2*pi*self.covariance)) * >> self.n >> > there should be the weights somewhere in there, maybe just replace >> > self.n by sum(weights) given a constant covariance >> > >> > sampling doesn't look difficult, if we want biased sampling, then >> > instead of randint, we would need weighted randint (non-uniform) >> > >> > integration might require more work, or not (I never tried to >> understand them) >> > >> > (I don't know if kde in statsmodels has weights on the schedule.) >> > >> > Josef >> > mostly guessing >> > >> > > >> > > Thanks, >> > > Zach >> > > _______________________________________________ >> > > SciPy-User mailing list >> > > SciPy-User scipy.org >> > > http://mail.scipy.org/mailman/listinfo/scipy-user >> > >> >> Hi, >> >> I am facing the same problem as well, but can't figure out how the >> weighting >> should be done exactly. >> >> Has anybody successfully completed the modification of the code to allow a >> weighted kde? I am attempting to perform kde on a set of imaging data >> with X, Y, >> and an additional "temperature" column. >> >> Performing the kde on only the X,Y axes gives a working heatmap showing >> the >> spatial distribution of the data points, but I would also like to use >> them to >> see the "temperature" profile (the third axis), much like a geographical >> heatmap >> showing temperature or rainfall values over a X-Y map. 
>> >> I found another set of code from >> http://pastebin.com/LNdYCZgw >> which allows weighted kde, but when I tried it out with my data, it took >> much >> longer than the normal kde (>1 hour) when the original code took only a >> about >> twenty seconds (despite claims that it was faster). >> >> Thanks, >> Jackson >> > > For what it's worth, the code you linked to is much slower for small > sample sizes. It's only faster with large numbers (>1e4) of points. It > also has a bit of a different use case than gaussian_kde. It's only > intended for making a regularly gridded KDE of a very large number of > points on a relatively fine grid. It bins the data onto a regular grid and > convolves it with an approriate gaussian kernel. This is a reasonable > approximation when you're dealing with a large number of points, but not so > reasonable if you only have a handful. Because the size of the gaussian > kernel can be very large when the sample size is low, the convolution can > be very slow for small sample sizes. Also, If I recall correctly, there's > a stray flipud that got left in there. You'll want to take it out. (Also, > while I think that got posted only a couple of years ago, I wrote it much > longer ago than that... There's some less-than-ideal code in there...) > > However, are you sure that you want a kernel density estimate? What > you're describing sounds like interpolation, not a weighted KDE. > > As an example, a weighted KDE would be used when you wanted to show the > density of point estimates while weighting it by error in the location of > the point. > I shouldn't have said "error in the location of the point". I guess it would me more like "confidence that the point exists" or more accurately, "magnitude of the point". Otherwise, the size of the Gaussian kernel would have to change depending on the data involved. As another (not exact) example, it can be handy when you want to sum some attribute over a map to yield a density estimate per-unit-area (e.g. population density, where you have populations of cities as your point measurements). In other words, if you want your temperature values to be summed-per-unit-area, then it's what you want. If you want to interpolate, it's not what you want. > > Instead, it sounds like you have a third variable that you want to make a > continuous map of based on irregularly sampled points. If so, have a look > at scipy.interpolate (and particularly scipy.interpolate.Rbf). > > Hope that helps, > -Joe > > >> >> >> >> >> >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user >> > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ian.stoic at gmail.com Sun Jan 13 20:34:45 2013 From: ian.stoic at gmail.com (ian stoic) Date: Mon, 14 Jan 2013 02:34:45 +0100 Subject: [SciPy-User] Scipy Cookbook availibility Message-ID: Hello Scipy community I'm looking for reST version of Scipy Cookbook ( http://www.scipy.org/Cookbook) which seems available in wiki and docbook format (and html of course). Besides I'm also looking for a way to download the Cookbook as archive if possible. I'm I right that none of the above is available? Regards, Ian Stoic -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From zachary.pincus at yale.edu Mon Jan 14 11:08:23 2013 From: zachary.pincus at yale.edu (Zachary Pincus) Date: Mon, 14 Jan 2013 11:08:23 -0500 Subject: [SciPy-User] Weighted KDE In-Reply-To: References: <0CE25948-3ED7-4928-9C63-4F44366C3AF5@yale.edu> Message-ID: > I am facing the same problem as well, but can't figure out how the weighting > should be done exactly. > > Has anybody successfully completed the modification of the code to allow a > weighted kde? I am attempting to perform kde on a set of imaging data with X, Y, > and an additional "temperature" column. > > Performing the kde on only the X,Y axes gives a working heatmap showing the > spatial distribution of the data points, but I would also like to use them to > see the "temperature" profile (the third axis), much like a geographical heatmap > showing temperature or rainfall values over a X-Y map. > > I found another set of code from > http://pastebin.com/LNdYCZgw > which allows weighted kde, but when I tried it out with my data, it took much > longer than the normal kde (>1 hour) when the original code took only a about > twenty seconds (despite claims that it was faster). > > Thanks, > Jackson > Here's a modification of the scipy KDE code that I made to perform weighting, as per the earlier discussion. No guarantees as to correctness, but it seems to be right-ish? Zach -------------- next part -------------- A non-text attachment was scrubbed... Name: weighted_kde.py Type: text/x-python-script Size: 4770 bytes Desc: not available URL: From joferkington at gmail.com Sun Jan 13 11:31:26 2013 From: joferkington at gmail.com (Joe Kington) Date: Sun, 13 Jan 2013 10:31:26 -0600 Subject: [SciPy-User] Weighted KDE In-Reply-To: References: <0CE25948-3ED7-4928-9C63-4F44366C3AF5@yale.edu> Message-ID: For what it's worth, the code you linked to is much slower for small sample sizes. It's only faster with large numbers (>1e4) of points. It also has a bit of a different use case than gaussian_kde. It's only intended for making a regularly gridded KDE of a very large number of points on a relatively fine grid. It bins the data onto a regular grid and convolves it with an approriate gaussian kernel. This is a reasonable approximation when you're dealing with a large number of points, but not so reasonable if you only have a handful. Because the size of the gaussian kernel can be very large when the sample size is low, the convolution can be very slow for small sample sizes. Also, If I recall correctly, there's a stray flipud that got left in there. You'll want to take it out. However, are you sure that you want a kernel density estimate? What you're describing sounds like interpolation, not a weighted KDE. As an example, a weighted KDE would be used when you wanted to show the density of point estimates while weighting it by error in the location of the point. Instead, it sounds like you have a third variable that you want to make a continuous map of based on irregularly sampled points. If so, have a look at scipy.interpolate (and particularly scipy.interpolate.Rbf). 
Hope that helps, -Joe On Sun, Jan 13, 2013 at 10:08 AM, Jackson Li wrote: > gmail.com> writes: > > > > > On Sun, May 13, 2012 at 1:07 PM, Zachary Pincus > yale.edu> > wrote: > > > Hello all, > > > > > > A while ago, someone asked on this list about whether it would be > simple to > modify > > scipy.stats.kde.gaussian_kde to deal with weighted data: > > > http://mail.scipy.org/pipermail/scipy-user/2008-November/018578.html > > > > > > Anne and Robert assured the writer that this was pretty simple (modulo > bandwidth selection), though I > > couldn't find any code that the original author may have generated based > on > that advice. > > > > > > I've got a problem that could (perhaps) be solved neatly with weighed > KDE, > so I'd like to give this a go. I > > assume that at a minimum, to get basic gaussian_kde.evaluate() > functionality: > > > > > > (1) The covariance calculation would need to be replaced by a weighted- > covariance calculation. (Simple enough.) > > > > > > (2) In evaluate(), the critical part looks like this (and a similar > stanza > that loops over the points instead): > > > # if there are more points than data, so loop over data > > > for i in range(self.n): > > > diff = self.dataset[:, i, newaxis] - points > > > tdiff = dot(self.inv_cov, diff) > > > energy = sum(diff*tdiff,axis=0) / 2.0 > > > result = result + exp(-energy) > > > > > > I assume that, further, the 'diff' values ought to be scaled by the > weights, > too. Is this all that would need > > to be done? (For the integration and resampling, obviously, there would > be a > bit of other work...) > > > > it looks to me that way, scaled according to weight by dataset points > > > > I don't see what the norm_factor should be: > > self._norm_factor = sqrt(linalg.det(2*pi*self.covariance)) * self.n > > there should be the weights somewhere in there, maybe just replace > > self.n by sum(weights) given a constant covariance > > > > sampling doesn't look difficult, if we want biased sampling, then > > instead of randint, we would need weighted randint (non-uniform) > > > > integration might require more work, or not (I never tried to understand > them) > > > > (I don't know if kde in statsmodels has weights on the schedule.) > > > > Josef > > mostly guessing > > > > > > > > Thanks, > > > Zach > > > _______________________________________________ > > > SciPy-User mailing list > > > SciPy-User scipy.org > > > http://mail.scipy.org/mailman/listinfo/scipy-user > > > > Hi, > > I am facing the same problem as well, but can't figure out how the > weighting > should be done exactly. > > Has anybody successfully completed the modification of the code to allow a > weighted kde? I am attempting to perform kde on a set of imaging data with > X, Y, > and an additional "temperature" column. > > Performing the kde on only the X,Y axes gives a working heatmap showing the > spatial distribution of the data points, but I would also like to use them > to > see the "temperature" profile (the third axis), much like a geographical > heatmap > showing temperature or rainfall values over a X-Y map. > > I found another set of code from > http://pastebin.com/LNdYCZgw > which allows weighted kde, but when I tried it out with my data, it took > much > longer than the normal kde (>1 hour) when the original code took only a > about > twenty seconds (despite claims that it was faster). 
> > Thanks, > Jackson > > > > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > -------------- next part -------------- An HTML attachment was scrubbed... URL: From sonicboomed at yahoo.com Sun Jan 13 12:53:25 2013 From: sonicboomed at yahoo.com (Jackson Li) Date: Sun, 13 Jan 2013 09:53:25 -0800 (PST) Subject: [SciPy-User] Weighted KDE Message-ID: <1358099605.31874.YahooMailNeo@web142401.mail.bf1.yahoo.com> On Sun, Jan 13, 2013 at 10:44 AM, Joe Kington wrote: >For what it's worth, the code you linked to is much slower for small >sample sizes. It's only faster with large numbers (>1e4) of points. It >also has a bit of a different use case than gaussian_kde. It's only >intended for making a regularly gridded KDE of a very large number of >points on a relatively fine grid. It bins the data onto a regular grid and >convolves it with an approriate gaussian kernel. This is a reasonable >approximation when you're dealing with a large number of points, but not so >reasonable if you only have a handful. Because the size of the gaussian >kernel can be very large when the sample size is low, the convolution can >be very slow for small sample sizes. Also, If I recall correctly, there's >a stray flipud that got left in there. You'll want to take it out. (Also, >while I think that got posted only a couple of years ago, I wrote it much >longer ago than that... There's some less-than-ideal code in there...) >>However, are you sure that you want a kernel density estimate? What >you're describing sounds like interpolation, not a weighted KDE. >>As an example, a weighted KDE would be used when you wanted to show the >density of point estimates while weighting it by error in the location of >the point. > >>I shouldn't have said "error in the location of the point". I guess it >>would me more like "confidence that the point exists" or more accurately, >>"magnitude of the point". Otherwise, the size of the Gaussian kernel would >>have to change depending on the data involved. >>As another (not exact) example, it can be handy when you want to sum some >>attribute over a map to yield a density estimate per-unit-area (e.g. >>population density, where you have populations of cities as your point >>measurements). In other words, if you want your temperature values to be >>summed-per-unit-area, then it's what you want. If you want to interpolate, >>it's not what you want. >>Instead, it sounds like you have a third variable that you want to make a >continuous map of based on irregularly sampled points. If so, have a look >at scipy.interpolate (and particularly scipy.interpolate.Rbf). >>Hope that helps, >-Joe Hi, Thanks for the quick reply. What you described for the population of cities is indeed what I want. I have several data points spread out randomly in XY space, and each data point has an independent third variable. (e.g. for 2 points very close to each other, one 50 and another 10, and all other data points are far away.? --> I would like that patch to get a value of 30 (average)) Hence, I would like to obtain a XY graph showing the density estimate of the third variable.? (if that patch is mostly high temperature on average, it should be "red", and if it is empty or has a lot of low temperature data points, then it should be "blue".) Thank you! Jackson -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From subhabangalore at gmail.com Mon Jan 14 08:38:47 2013 From: subhabangalore at gmail.com (Subhabrata Banerjee) Date: Mon, 14 Jan 2013 05:38:47 -0800 (PST) Subject: [SciPy-User] Computing Eigenvalue of Laplacian Matrix Message-ID: <7f275d2e-8afb-4004-96d6-8cf2e23dd1d6@googlegroups.com> Dear Group, I like to compute the eigenvalue of Laplacian matrix and then like to view it as graph. If any one of the learned members of the group may kindly suggest me the procedure. Thanking in advance, Regards, Subhabrata. -------------- next part -------------- An HTML attachment was scrubbed... URL: From terribleangel at gmail.com Mon Jan 14 12:31:45 2013 From: terribleangel at gmail.com (Will) Date: Mon, 14 Jan 2013 17:31:45 +0000 (UTC) Subject: [SciPy-User] Fwd: scipy.test() fails for 0.11.0 on OS X 10.8.2 References: Message-ID: Ralf Gommers gmail.com> writes: > > A little. IIRC these binaries (compiled on OS X 10.6) worked fine > on 10.7, so something changed in again for 10.8. We really > should stop using Accelerate for the official binaries. > > Ralf > Just a follow up in case anyone else is interested: The scipy homebrew formula available from Samuel John (https://github.com/samueljohn/homebrew-python) has a --with-openblas option that builds numpy and scipy using netlib LAPACK and OpenBLAS (http://xianyi.github.com/OpenBLAS/). By this method, I was able to get scipy 0.11.0 installed in OS X 10.8.2 and have it run its tests with no failures. Will From alejandro.weinstein at gmail.com Mon Jan 14 12:48:52 2013 From: alejandro.weinstein at gmail.com (Alejandro Weinstein) Date: Mon, 14 Jan 2013 10:48:52 -0700 Subject: [SciPy-User] Computing Eigenvalue of Laplacian Matrix In-Reply-To: <7f275d2e-8afb-4004-96d6-8cf2e23dd1d6@googlegroups.com> References: <7f275d2e-8afb-4004-96d6-8cf2e23dd1d6@googlegroups.com> Message-ID: On Mon, Jan 14, 2013 at 6:38 AM, Subhabrata Banerjee wrote: > I like to compute the eigenvalue of Laplacian matrix and then like to view > it as graph. import matplotlib.pyplot as plt import numpy as np from numpy.linalg import eig # Assuming the laplacian is in matrix L evalues, evec = eig(L) plt.plot(np.real(evalues)) Alejandro From joferkington at gmail.com Mon Jan 14 14:58:08 2013 From: joferkington at gmail.com (Joe Kington) Date: Mon, 14 Jan 2013 13:58:08 -0600 Subject: [SciPy-User] Weighted KDE In-Reply-To: <1358099605.31874.YahooMailNeo@web142401.mail.bf1.yahoo.com> References: <1358099605.31874.YahooMailNeo@web142401.mail.bf1.yahoo.com> Message-ID: On Jan 14, 2013 11:31 AM, "Jackson Li" wrote: > > On Sun, Jan 13, 2013 at 10:44 AM, Joe Kington wrote: > > > For what it's worth, the code you linked to is much slower for small > > sample sizes. It's only faster with large numbers (>1e4) of points. It > > also has a bit of a different use case than gaussian_kde. It's only > > intended for making a regularly gridded KDE of a very large number of > > points on a relatively fine grid. It bins the data onto a regular grid and > > convolves it with an approriate gaussian kernel. This is a reasonable > > approximation when you're dealing with a large number of points, but not so > > reasonable if you only have a handful. Because the size of the gaussian > > kernel can be very large when the sample size is low, the convolution can > > be very slow for small sample sizes. Also, If I recall correctly, there's > > a stray flipud that got left in there. You'll want to take it out. 
(Also, > > while I think that got posted only a couple of years ago, I wrote it much > > longer ago than that... There's some less-than-ideal code in there...) > > > > However, are you sure that you want a kernel density estimate? What > > you're describing sounds like interpolation, not a weighted KDE. > > > > As an example, a weighted KDE would be used when you wanted to show the > > density of point estimates while weighting it by error in the location of > > the point. > > > > >>I shouldn't have said "error in the location of the point". I guess it > >>would me more like "confidence that the point exists" or more accurately, > >>"magnitude of the point". Otherwise, the size of the Gaussian kernel would > >>have to change depending on the data involved. > > >>As another (not exact) example, it can be handy when you want to sum some > >>attribute over a map to yield a density estimate per-unit-area (e.g. > >>population density, where you have populations of cities as your point > >>measurements). In other words, if you want your temperature values to be > >>summed-per-unit-area, then it's what you want. If you want to interpolate, > >>it's not what you want. > > > > > > Instead, it sounds like you have a third variable that you want to make a > > continuous map of based on irregularly sampled points. If so, have a look > > at scipy.interpolate (and particularly scipy.interpolate.Rbf). > > > > Hope that helps, > > -Joe > > > Hi, > > Thanks for the quick reply. > > What you described for the population of cities is indeed what I want. > > I have several data points spread out randomly in XY space, and each data point has an independent third variable. > > (e.g. for 2 points very close to each other, one 50 and another 10, and all other data points are far away. You're describing interpolation, for whatever it's worth. You want to interpolate your "z" values, not determine the number of samples you have per unit area. A KDE will give you "bulls eyes" around where you have data and the resulting values won't directly reflect the weight values you pass in. Instead, the values will mostly reflect where you have clusters of point measurements, modified by the localized sum of the weights. The exact value you get will depend on the covariance of your sampled point distribution. Instead, you want a smooth surface that reflects your sampled z values. Have a look at some of the examples involving scipy.interpolate.griddata or scipy.interpolate.Rbf. The cookbook is a bit out of date, but take a look at the second example on this page: http://www.scipy.org/Cookbook/RadialBasisFunctions Hope that helps! -Joe > > --> I would like that patch to get a value of 30 (average)) > > > Hence, I would like to obtain a XY graph showing the density estimate of the third variable. > > (if that patch is mostly high temperature on average, it should be "red", and if it is empty or has a lot of low temperature data points, then it should be "blue".) > > > Thank > you! > > Jackson > > > > > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From hannah.lona at gmail.com Mon Jan 14 23:50:24 2013 From: hannah.lona at gmail.com (hannahlona) Date: Mon, 14 Jan 2013 20:50:24 -0800 (PST) Subject: [SciPy-User] Parallel Differential Evolution In-Reply-To: References: Message-ID: <1358225424974-17671.post@n7.nabble.com> I am trying to update the DE code from the following email with Andrea's suggestions: / (1). numpy.zeros(X) instead of flex.double(X, 0) (2). 1000*numpy.ones(X) instead of flex.double(X, 1000) (3). numpy.min(X) for flex.min(X), etc. (mean, sum) (4). numpy.random.uniform(size=N) for flex.random_double(N) However, to get this work, I also had to modify the following: - modification (4). only works when floats are meant to be used, I guess, so in the only case: rnd = numpy.random.uniform(size=self.vector_length) instead of flex.random_double(N). In the other two cases it is random_values = numpy.random.random_integers(low=0.0, high=1.0, size=N) instead of flex.random_double(N) - Also, numpy.nanargmin instead of flex.min_index - Also, numpy.argsort instead of flex.sort_permutation - Also, .copy() instead of .deep_copy() - Finally, numpy.random.seed(0) instead of flex.set_random_seed(0) / but I am getting the following error when I try to run the code: Traceback (most recent call last): File "C:...\StornDEcode.py", line 260, in run() File "C:...\StornDEcode.py", line 255, in run test_rosenbrock_function(1) File "C:\...\StornDEcode.py", line 232, in __init__ self.optimizer = differential_evolution_optimizer(self,population_size=min(self.n*10,40),n_cross=self.n,cr=0.9, eps=1e-8, show_progress=True) File "C:\...\StornDEcode.py", line 95, in __init__ self.optimize() File "C:\...\StornDEcode.py", line 116, in optimize self.evolve() File "C:\...\StornDEcode.py", line 166, in evolve i1=permut[0] IndexError: invalid index to scalar variable. The original code is here: http://cci.lbl.gov/cctbx_sources/scitbx/differential_evolution.py And I have uploaded my version of it. I am new to the forum, and to python, and would appreciate any help anyone has to offer! StornDEcode.py Hannah -- View this message in context: http://scipy-user.10969.n7.nabble.com/Parallel-Differential-Evolution-tp12097p17671.html Sent from the Scipy-User mailing list archive at Nabble.com. From lists at hilboll.de Tue Jan 15 04:46:46 2013 From: lists at hilboll.de (Andreas Hilboll) Date: Tue, 15 Jan 2013 10:46:46 +0100 Subject: [SciPy-User] griddata() performance Message-ID: <50F52586.20005@hilboll.de> Hi, I'm wondering which performance I can expect from griddata. I will need to interpolate from 4d unstructured data, with dimensions at least 5x17x38x52, and I need to get the 1d-values of ~ 15k points. So far, the routine is running for > 30 minutes, and I'm wondering if this is to be expected. My machine is a AMD Opteron(tm) Processor 8439 SE 2.8GHz CPU. I guess there's no easy way to parallelize this? Cheers, Andreas. From pav at iki.fi Tue Jan 15 05:43:32 2013 From: pav at iki.fi (Pauli Virtanen) Date: Tue, 15 Jan 2013 10:43:32 +0000 (UTC) Subject: [SciPy-User] griddata() performance References: <50F52586.20005@hilboll.de> Message-ID: Andreas Hilboll hilboll.de> writes: > I'm wondering which performance I can expect from griddata. I will need > to interpolate from 4d unstructured data, with dimensions at least > 5x17x38x52, and I need to get the 1d-values of ~ 15k points. So far, the > routine is running for > 30 minutes, and I'm wondering if this is to be > expected. My machine is a AMD Opteron(tm) Processor 8439 SE 2.8GHz CPU. 
> I guess there's no easy way to parallelize this? The step taking the time is Delaunay triangulation of your data point set. In 4D, the number of simplices is probably huge, and computing them takes time. This is done by the Qhull library, and probably nontrivial (or impossible) to parallelize. The algorithm is probably not far from state-of-the-art, so I don't think it is possible to speed this up significantly in 4D. However, doesn't the fact that that you say "5x17x38x52" mean that your data has structure that you can exploit? (A grid with non-uniform spacing in each dimension?) If so, using griddata() is not the best approach. One answer for rectangular grid are tensor product splines. Scipy doesn't have implementation of these though (yet), but it's possible to roll up your own. Alternatively, if you data is really unstructured something like RBF or inverse distance weighing may work (more or less badly, depends on what you want). -- Pauli Virtanen From rik at cogsci.ucsd.edu Mon Jan 14 20:13:48 2013 From: rik at cogsci.ucsd.edu (RKBelew) Date: Mon, 14 Jan 2013 17:13:48 -0800 Subject: [SciPy-User] capturing stdout, stderr from integrate.odeint() ? Message-ID: <50F4AD4C.90300@cogsci.ucsd.edu> i'm trying to capture the various warnings being generated by lsoda from the FORTRAN library odepack, ala > lsoda-- warning..internal t (=r1) and h (=r2) are > such that in the machine, t + h = t on the next step > (h = step size). solver will continue anyway > In above, R1 = 0.2867368005697E+02 R2 = 0.1738483517440E-14 my strategy has been to just redirect sys.stdout and/or sys.stderr to a StringIO() buffer and then explore it. but that isn't working: i still get the warnings, and my buffer remains empty? can someone point me to where in the guts of scipy.integrate this might be resolved, and/or why my approach isn't working? thanks for your help. rik From silva at lma.cnrs-mrs.fr Tue Jan 15 09:35:15 2013 From: silva at lma.cnrs-mrs.fr (Fabrice Silva) Date: Tue, 15 Jan 2013 15:35:15 +0100 Subject: [SciPy-User] capturing stdout, stderr from integrate.odeint() ? In-Reply-To: <50F4AD4C.90300@cogsci.ucsd.edu> References: <50F4AD4C.90300@cogsci.ucsd.edu> Message-ID: <1358260515.17516.3.camel@laptop-101> Le lundi 14 janvier 2013 ? 17:13 -0800, RKBelew a ?crit : > i'm trying to capture the various warnings being generated > by lsoda from the FORTRAN library odepack, ala > > > lsoda-- warning..internal t (=r1) and h (=r2) are > > such that in the machine, t + h = t on the next step > > (h = step size). solver will continue anyway > > In above, R1 = 0.2867368005697E+02 R2 = 0.1738483517440E-14 > > my strategy has been to just redirect sys.stdout and/or sys.stderr > to a StringIO() buffer and then explore it. but that isn't working: > i still get the warnings, and my buffer remains empty? > > can someone point me to where in the guts of scipy.integrate > this might be resolved, and/or why my approach isn't working? maybe set the full_output keyword argument to True, and get the second output (infodict) and its 'message' item. From njs at pobox.com Tue Jan 15 11:51:32 2013 From: njs at pobox.com (Nathaniel Smith) Date: Tue, 15 Jan 2013 16:51:32 +0000 Subject: [SciPy-User] capturing stdout, stderr from integrate.odeint() ? 
In-Reply-To: <50F4AD4C.90300@cogsci.ucsd.edu> References: <50F4AD4C.90300@cogsci.ucsd.edu> Message-ID: On 15 Jan 2013 14:22, "RKBelew" wrote: > > i'm trying to capture the various warnings being generated > by lsoda from the FORTRAN library odepack, ala > > > lsoda-- warning..internal t (=r1) and h (=r2) are > > such that in the machine, t + h = t on the next step > > (h = step size). solver will continue anyway > > In above, R1 = 0.2867368005697E+02 R2 = 0.1738483517440E-14 > > my strategy has been to just redirect sys.stdout and/or sys.stderr > to a StringIO() buffer and then explore it. but that isn't working: > i still get the warnings, and my buffer remains empty? > > can someone point me to where in the guts of scipy.integrate > this might be resolved, and/or why my approach isn't working? sys.std{out,err} are ordinary Python file objects, which normally point to the operating system stdout/stderr (file descriptors 1 and 2). The fortran code is writing directly to the operating system stdout/stderr without going through the Python layer, so redirecting things at the Python layer doesn't help. It's possible to work around this using elaborate Unix tricks -- create an os.pipe(), use os.dup2() to set it up so that writes to file descriptors 1 and 2 will go to your pipe, create some threads to read the other ends of the pipes into a Python buffer of some sort -- but it may not be worth the hassle. The other cumbersome workaround would be to split your odepack-using code out into a separate program, and then use the 'subprocess' module to call it and read its output... -n From klonuo at gmail.com Wed Jan 16 03:23:25 2013 From: klonuo at gmail.com (klo uo) Date: Wed, 16 Jan 2013 09:23:25 +0100 Subject: [SciPy-User] audio related scikits fortune Message-ID: Can anyone provide more information about these scikits: - audiolab (wrapper to libsndfile) - samplerate (wrapper to libsamplerate) - talkbox (couple of speech related functions) They are all by the same author and it seems to me like some rapid event caused their abandonment some 3 years ago. All had future plans and TODOs, and are moderately popular according pypi data. Also I experienced that their packaging is unfortunate and documentation hard to generate. No binary distributions and neither package reads standard library/include folders, but manually editing site.cfg is required. Github issues and PR are frozen. I opened issue on one of packages asking for information, but got no response, so I thought that maybe someone here knows more about it. Thanks -------------- next part -------------- An HTML attachment was scrubbed... URL: From cournape at gmail.com Wed Jan 16 10:20:36 2013 From: cournape at gmail.com (David Cournapeau) Date: Wed, 16 Jan 2013 09:20:36 -0600 Subject: [SciPy-User] audio related scikits fortune In-Reply-To: References: Message-ID: Hi Klo, On Wed, Jan 16, 2013 at 2:23 AM, klo uo wrote: > Can anyone provide more information about these scikits: > > - audiolab (wrapper to libsndfile) > - samplerate (wrapper to libsamplerate) > - talkbox (couple of speech related functions) > > They are all by the same author and it seems to me like some rapid event > caused their abandonment some 3 years ago. All had future plans and TODOs, > and are moderately popular according pypi data. > > Also I experienced that their packaging is unfortunate and documentation > hard to generate. No binary distributions and neither package reads standard > library/include folders, but manually editing site.cfg is required. 
The rapid event is that I graduated :) The packaging should be pretty standard. There is some complexity to look for the underlying library on mac/windows, but nothing fundamental. > > Github issues and PR are frozen. I opened issue on one of packages asking > for information, but got no response, so I thought that maybe someone here > knows more about it. I don't know why I did not see all those issues/PR, that's not good, I would have expected GH to ping me. I will look into those today for at least audiolab (the most popular). I don't have time to upgrade them, though. I would be happy to 'give the key' to people who are interested in picking it up beyond mere maintenance. cheers, David From horea.christ at gmail.com Wed Jan 16 08:32:19 2013 From: horea.christ at gmail.com (Horea Christian) Date: Wed, 16 Jan 2013 05:32:19 -0800 (PST) Subject: [SciPy-User] transfrom.frozen() failing on account of matplotlib.nxutils Message-ID: <7231aafe-9d70-4aa9-a8e5-17c73e8e3c55@googlegroups.com> Hey there, this question is rather matplotlib-oriented but they don't seem to have a working user mailing list. I am using psychopy ( a python lib depending on scipy, numpy, and matplotlib), and I am having the following issue https://groups.google.com/forum/?fromgroups=#!topic/psychopy-users/1L1U-VZwXZc In short my error message is http://paste2.org/p/2753557 I have been told this is due to how matplotlib.nxutils is working on my system. I am running matplotlib-1.2.0-r1 Could you give me any pointers regarding this? Cheers, -------------- next part -------------- An HTML attachment was scrubbed... URL: From aquil.abdullah at gmail.com Wed Jan 16 20:41:03 2013 From: aquil.abdullah at gmail.com (Aquil H. Abdullah) Date: Wed, 16 Jan 2013 20:41:03 -0500 Subject: [SciPy-User] Installing Scipy 0.11.0 on a MacBook Pro Running Mac OS x 7.5 Message-ID: Hello All, I've recently tried to install SciPy on my Mac Book Pro *Processor* 2.5 GHz Intel Core i7 *Memory* 8 GB 1333 MHz DDR3 *Software* Mac OS X Lion 10.7.5 (11G63b) XCode: Version 4.5.2 (4G2008a) [NOTE: Previously, this was Version: 4.2.1] I've successfully installed NumPy, using the following commands: export CC=gcc-4.2 export CXX=g++-4.2 export FFLAGS=-ff2c git clone https://github.com/scipy/scipy.git cd scipy python setup.py build sudo setup.py install No matter what options I try, I cannot get Scipy to compile the FAT binaries that contain both the i386 image and the x86_64 image. So when I try to import scipy.stats, I get the following error: I tried the same commands on another computer in my office and everything built fine, and I am able to run scipy. If I look at the dynamic libraries in that version of scipy I see the following: [GCC 4.2.1 (Based on Apple Inc. build 5658) (LLVM build 2335.15.00)] on darwin Type "help", "copyright", "credits" or "license" for more information. >>> import scipy.stats Traceback (most recent call last): File "", line 1, in File "/Library/Python/2.7/site-packages/scipy/stats/__init__.py", line 321, in from stats import * File "/Library/Python/2.7/site-packages/scipy/stats/stats.py", line 193, in import scipy.special as special File "/Library/Python/2.7/site-packages/scipy/special/__init__.py", line 525, in from _cephes import * ImportError: dlopen(/Library/Python/2.7/site-packages/scipy/special/_cephes.so, 2): no suitable image found. Did find: /Library/Python/2.7/site-packages/scipy/special/_cephes.so: mach-o, but wrong architecture Any suggestions? 
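(In case it helps to narrow things down, one way to compare the architectures on
both sides -- using the stock OS X file/lipo tools and the Python that fails the
import; the path is simply the one from the traceback above -- would be:

file /Library/Python/2.7/site-packages/scipy/special/_cephes.so
lipo -info /Library/Python/2.7/site-packages/scipy/special/_cephes.so
python -c "import platform; print platform.architecture()"

i.e. check which architectures _cephes.so actually contains versus the one the
running interpreter reports.)
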
OTHER DETAILS *GCC VERSION* > gcc-4.2 --version i686-apple-darwin11-gcc-4.2.1 (GCC) 4.2.1 (Apple Inc. build 5666) (dot 3) Copyright (C) 2007 Free Software Foundation, Inc. This is free software; see the source for copying conditions. There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. *G++ VERSION* > g++-4.2 --version i686-apple-darwin11-g++-4.2.1 (GCC) 4.2.1 (Apple Inc. build 5666) (dot 3) Copyright (C) 2007 Free Software Foundation, Inc. This is free software; see the source for copying conditions. There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. *GFORTRAN VERSION* > gfortran --version GNU Fortran (GCC) 4.2.1 (Apple Inc. build 5666) (dot 3) Copyright (C) 2007 Free Software Foundation, Inc. -- Aquil H. Abdullah aquil.abdullah at gmail.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From Jean-Paul.JADAUD at CEA.FR Thu Jan 17 02:46:20 2013 From: Jean-Paul.JADAUD at CEA.FR (Jean-Paul.JADAUD at CEA.FR) Date: Thu, 17 Jan 2013 08:46:20 +0100 Subject: [SciPy-User] transfrom.frozen() failing on account ofmatplotlib.nxutils In-Reply-To: <7231aafe-9d70-4aa9-a8e5-17c73e8e3c55@googlegroups.com> References: <7231aafe-9d70-4aa9-a8e5-17c73e8e3c55@googlegroups.com> Message-ID: <6BE3FB83A53E5E4D9A599CC05BC175651BDAD9@U-MSGDAM.dif.dam.intra.cea.fr> You should have a look on matplotlib documentation : you may have trouble with deprecated features if you are using matplotlib 1.2 See http://matplotlib.org/api/nxutils_api.html?highlight=nxutils#matplotlib.nxutils.pnpoly There is a working matplotlib mailing list? Cheers, JP Jadaud De : scipy-user-bounces at scipy.org [mailto:scipy-user-bounces at scipy.org] De la part de Horea Christian Envoy? : mercredi 16 janvier 2013 14:32 ? : scipy-user at googlegroups.com Objet : [SciPy-User] transfrom.frozen() failing on account ofmatplotlib.nxutils Hey there, this question is rather matplotlib-oriented but they don't seem to have a working user mailing list. I am using psychopy ( a python lib depending on scipy, numpy, and matplotlib), and I am having the following issue https://groups.google.com/forum/?fromgroups=#!topic/psychopy-users/1L1U-VZwXZc In short my error message is http://paste2.org/p/2753557 I have been told this is due to how matplotlib.nxutils is working on my system. I am running matplotlib-1.2.0-r1 Could you give me any pointers regarding this? Cheers, -------------- next part -------------- An HTML attachment was scrubbed... URL: From burger.ga at gmail.com Thu Jan 17 04:33:12 2013 From: burger.ga at gmail.com (Gerhard Burger) Date: Thu, 17 Jan 2013 10:33:12 +0100 Subject: [SciPy-User] numpy test fails with "Illegal instruction' Message-ID: Dear numpy/scipy users, I am trying to get numpy to work on my computer, but so far no luck. When I run `numpy.test(verbose=10)` it crashes with test_polyfit (test_polynomial.TestDocs) ... Illegal instruction In the FAQ it states that I should provide the following information (running Ubuntu 12.04 64bit): os.name = 'posix' uname -r = 3.2.0-35-generic sys.platform = 'linux2' sys.version = '2.7.3 (default, Aug 1 2012, 05:14:39) \n[GCC 4.6.3]' Atlas is not installed (not required for numpy, only for scipy right?) It fails both when I install numpy 1.6.2 with `pip install numpy` and if I install the latest dev version from git. Can someone give me some pointers on how to solve this? I will be grateful for any help you can provide. 
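If it is useful, I can try to narrow down where the illegal instruction comes
from -- assuming gdb is available -- by running the test suite under the
debugger and grabbing a backtrace when it dies:

gdb --args python -c "import numpy; numpy.test()"
(gdb) run
(gdb) backtrace

and I can also post the CPU feature flags and the build configuration:

grep -m1 flags /proc/cpuinfo
python -c "import numpy; numpy.show_config()"
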
Kind regards, Gerhard Burger -------------- next part -------------- An HTML attachment was scrubbed... URL: From g.brandl at gmx.net Fri Jan 18 03:18:27 2013 From: g.brandl at gmx.net (Georg Brandl) Date: Fri, 18 Jan 2013 09:18:27 +0100 Subject: [SciPy-User] curve_fit with float32 values In-Reply-To: References: Message-ID: Am 12.01.2013 15:58, schrieb Pauli Virtanen: > 12.01.2013 14:54, Georg Brandl kirjoitti: > [clip] >> In the end (which took quite a while!) we found that the problem >> was that the X data (which was directly loaded from a HDF file) >> had a float32 dtype. This seems to confuse curve_fit. Same goes >> for float16. float128 at least raises an exception. Integer types >> seem fine given rounding, see the code/output below. > > AFAIK, a likely bug here is probably the choice of epsilon for numerical > differentiation in leastsq --- the chosen epsilon is probably smaller > than the machine epsilon for float32, hence problems appear. I had a look at the code now; minpack_lmdif calls ap_x = (PyArrayObject *)PyArray_ContiguousFromObject(x0, NPY_DOUBLE, 1, 1); IIUC this should already result in a float64 array, at least on machines where double is 64-bits? Georg From sebastian at sipsolutions.net Fri Jan 18 04:53:24 2013 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Fri, 18 Jan 2013 10:53:24 +0100 Subject: [SciPy-User] curve_fit with float32 values In-Reply-To: References: Message-ID: <1358502804.2453.53.camel@sebastian-laptop> On Fri, 2013-01-18 at 09:18 +0100, Georg Brandl wrote: > Am 12.01.2013 15:58, schrieb Pauli Virtanen: > > 12.01.2013 14:54, Georg Brandl kirjoitti: > > [clip] > >> In the end (which took quite a while!) we found that the problem > >> was that the X data (which was directly loaded from a HDF file) > >> had a float32 dtype. This seems to confuse curve_fit. Same goes > >> for float16. float128 at least raises an exception. Integer types > >> seem fine given rounding, see the code/output below. > > > > AFAIK, a likely bug here is probably the choice of epsilon for numerical > > differentiation in leastsq --- the chosen epsilon is probably smaller > > than the machine epsilon for float32, hence problems appear. > > I had a look at the code now; minpack_lmdif calls > > ap_x = (PyArrayObject *)PyArray_ContiguousFromObject(x0, NPY_DOUBLE, 1, 1); > > IIUC this should already result in a float64 array, at least on machines > where double is 64-bits? > It may have been fixed, but I think the problem may be only about the temporaries within the function evaluation. You do not and maybe even cannot know if `args` contains a float32 array (you could check that the function return value is double, but even that may not tell the whole story). Sebastian > Georg > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From sebastian at sipsolutions.net Fri Jan 18 05:36:42 2013 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Fri, 18 Jan 2013 11:36:42 +0100 Subject: [SciPy-User] curve_fit with float32 values In-Reply-To: <1358502804.2453.53.camel@sebastian-laptop> References: <1358502804.2453.53.camel@sebastian-laptop> Message-ID: <1358505402.2453.55.camel@sebastian-laptop> On Fri, 2013-01-18 at 10:53 +0100, Sebastian Berg wrote: > On Fri, 2013-01-18 at 09:18 +0100, Georg Brandl wrote: > > Am 12.01.2013 15:58, schrieb Pauli Virtanen: > > > 12.01.2013 14:54, Georg Brandl kirjoitti: > > > [clip] > > >> In the end (which took quite a while!) 
we found that the problem > > >> was that the X data (which was directly loaded from a HDF file) > > >> had a float32 dtype. This seems to confuse curve_fit. Same goes > > >> for float16. float128 at least raises an exception. Integer types > > >> seem fine given rounding, see the code/output below. > > > > > > AFAIK, a likely bug here is probably the choice of epsilon for numerical > > > differentiation in leastsq --- the chosen epsilon is probably smaller > > > than the machine epsilon for float32, hence problems appear. > > > > I had a look at the code now; minpack_lmdif calls > > > > ap_x = (PyArrayObject *)PyArray_ContiguousFromObject(x0, NPY_DOUBLE, 1, 1); > > > > IIUC this should already result in a float64 array, at least on machines > > where double is 64-bits? > > > > It may have been fixed, but I think the problem may be only about the > temporaries within the function evaluation. You do not and maybe even > cannot know if `args` contains a float32 array (you could check that the > function return value is double, but even that may not tell the whole > story). > To be exact, the thing is x0 gets typically unpacked in the evaluation function. Which means that x0 is a set of scalars that have no effect on the casting. You *could* hack around that... but... > Sebastian > > > Georg > > > > _______________________________________________ > > SciPy-User mailing list > > SciPy-User at scipy.org > > http://mail.scipy.org/mailman/listinfo/scipy-user > > > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From ralf.gommers at gmail.com Fri Jan 18 19:41:29 2013 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Sat, 19 Jan 2013 01:41:29 +0100 Subject: [SciPy-User] Installing Scipy 0.11.0 on a MacBook Pro Running Mac OS x 7.5 In-Reply-To: References: Message-ID: On Thu, Jan 17, 2013 at 2:41 AM, Aquil H. Abdullah wrote: > Hello All, > > I've recently tried to install SciPy on my Mac Book Pro > > *Processor* 2.5 GHz Intel Core i7 > > *Memory* 8 GB 1333 MHz DDR3 > > *Software* Mac OS X Lion 10.7.5 (11G63b) > > XCode: Version 4.5.2 (4G2008a) [NOTE: Previously, this was Version: 4.2.1] > > > I've successfully installed NumPy, using the following commands: > > export CC=gcc-4.2 > > export CXX=g++-4.2 > > export FFLAGS=-ff2c > > git clone https://github.com/scipy/scipy.git > > cd scipy > > python setup.py build > > sudo setup.py install > > No matter what options I try, I cannot get Scipy to compile the FAT > binaries that contain both the i386 image and the x86_64 image. So when I > try to import scipy.stats, I get the following error: > You should get i386 and x86_64 by default. Can you put up the full build log somewhere? Ralf > I tried the same commands on another computer in my office and everything > built fine, and I am able to run scipy. If I look at the dynamic libraries > in that version of scipy I see the following: > > [GCC 4.2.1 (Based on Apple Inc. build 5658) (LLVM build 2335.15.00)] on > darwin > > Type "help", "copyright", "credits" or "license" for more information. 
> > >>> import scipy.stats > > Traceback (most recent call last): > > File "", line 1, in > > File "/Library/Python/2.7/site-packages/scipy/stats/__init__.py", line > 321, in > > from stats import * > > File "/Library/Python/2.7/site-packages/scipy/stats/stats.py", line 193, > in > > import scipy.special as special > > File "/Library/Python/2.7/site-packages/scipy/special/__init__.py", line > 525, in > > from _cephes import * > > ImportError: > dlopen(/Library/Python/2.7/site-packages/scipy/special/_cephes.so, 2): no > suitable image found. Did find: > > /Library/Python/2.7/site-packages/scipy/special/_cephes.so: mach-o, but > wrong architecture > > > Any suggestions? > > > OTHER DETAILS > > *GCC VERSION* > > > gcc-4.2 --version > > i686-apple-darwin11-gcc-4.2.1 (GCC) 4.2.1 (Apple Inc. build 5666) (dot 3) > > Copyright (C) 2007 Free Software Foundation, Inc. > > This is free software; see the source for copying conditions. There is NO > > warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. > > > *G++ VERSION* > > > g++-4.2 --version > > i686-apple-darwin11-g++-4.2.1 (GCC) 4.2.1 (Apple Inc. build 5666) (dot 3) > > Copyright (C) 2007 Free Software Foundation, Inc. > > This is free software; see the source for copying conditions. There is NO > > warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. > > > *GFORTRAN VERSION* > > > gfortran --version > > GNU Fortran (GCC) 4.2.1 (Apple Inc. build 5666) (dot 3) > > Copyright (C) 2007 Free Software Foundation, Inc. > > > -- > Aquil H. Abdullah > aquil.abdullah at gmail.com > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From Sam.Cable at kirtland.af.mil Fri Jan 18 19:54:49 2013 From: Sam.Cable at kirtland.af.mil (Cable, Sam B Civ USAF AFMC AFRL/RVBXI) Date: Fri, 18 Jan 2013 17:54:49 -0700 Subject: [SciPy-User] scipy install: atlas incomplete Message-ID: <39B5ED61E7BFC24FA8277B6DE92A9A3F04FB8BC2@fkimlki01.enterprise.afmc.ds.af.mil> I am trying to install scipy, and encountering problems. Comparing my results to the FAQ page, it looks like my ATLAS is incomplete. The page says: LAPACK library provided by ATLAS is incomplete You will notice it when getting import errors like ImportError: .../flapack.so : undefined symbol: sgesdd_ To be sure that NumPy/SciPy is built against a complete LAPACK, check the size of the file liblapack.a - it should be about 6MB. The location of liblapack.a is shown by executing python numpy/distutils/system_info.py lapack To fix: follow the instructions in Building a complete LAPACK library to create a complete liblapack.a. Then copy liblapack.a to the same location where libatlas.a is installed and retry with scipy build. The actual object I am missing is "sgges_". I have tried two solutions. 1) I have followed the ATLAS instructions for making a complete build of LAPACK, rev. 3.4.2. (BTW, the resulting LAPACK is about 10MB, bigger than the 6MB in the FAQ.) 2) I have found a pre-compiled binary for LAPACK - rev. unclear -- and just downloaded it and dropped it in place. (It is close to the 6MB in size.) I get the same problem regardless. "nm" shows sgges_ defined in liblapack.a plain as day. System_info.py finds my lapack just fine in /usr/local/lib. Is this a critical failure? Is there anything else to do? 
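For completeness, a couple of further checks I can run and report (the build
path below is just where a default "python setup.py build" tends to put things
on this kind of box, so it may differ):

python -c "from numpy.distutils.system_info import get_info; print get_info('lapack_opt')"
nm -D build/lib.linux-x86_64-2.7/scipy/linalg/flapack.so | grep -i sgges
ldd build/lib.linux-x86_64-2.7/scipy/linalg/flapack.so

The first shows which LAPACK the build machinery is actually picking up, and the
other two show whether the built flapack extension still has sgges_ undefined
and which shared libraries it resolves against.
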
BTW, I am running python 2.7 on a 64 bit CentOs 5.x machine and gfortran is my FORTRAN compiler. Thanks. -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: smime.p7s Type: application/pkcs7-signature Size: 5664 bytes Desc: not available URL: From francescoboccacci at libero.it Sat Jan 19 06:34:49 2013 From: francescoboccacci at libero.it (francescoboccacci at libero.it) Date: Sat, 19 Jan 2013 12:34:49 +0100 (CET) Subject: [SciPy-User] Epanechnikov kernel Message-ID: <6786840.3047071358595289503.JavaMail.defaultUser@defaultHost> Hi all,I have a question for you. Is it possible in scipy using a Epanechnikov kernel function?I checked on scipy documentation but i found that the only way to calculate kernel-density estimate is possible only with using Gaussian kernels?Is it true? Can you help me? Thanks Francesco -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Sat Jan 19 07:49:58 2013 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Sat, 19 Jan 2013 07:49:58 -0500 Subject: [SciPy-User] Epanechnikov kernel In-Reply-To: <6786840.3047071358595289503.JavaMail.defaultUser@defaultHost> References: <6786840.3047071358595289503.JavaMail.defaultUser@defaultHost> Message-ID: On Sat, Jan 19, 2013 at 6:34 AM, francescoboccacci at libero.it wrote: > Hi all, > > I have a question for you. Is it possible in scipy using a Epanechnikov > kernel function? > > I checked on scipy documentation but i found that the only way to calculate > kernel-density estimate is possible only with using Gaussian kernels? > > Is it true? Yes, kde in scipy.stats only has gaussian_kde Also in statsmodels currently only gaussian is supported for continuous data http://statsmodels.sourceforge.net/devel/nonparametric.html (It was removed because in the references only the bandwidth selection made much difference in the estimation, but not the shape of the kernel. Other kernels for continuous variables will come back eventually. There is still some old code in the sandbox for generic kernels https://github.com/statsmodels/statsmodels/blob/master/statsmodels/sandbox/nonparametric/kernels.py No idea about the status. On the other hand we do have automatic bandwidth selection, and kernels for categorical and ordered variables.) astroML http://astroml.github.com/modules/generated/astroML.density_estimation.KDE.html#astroML.density_estimation.KDE has some other kernels, but not Epanechnikov. Josef > > Can you help me? > > > Thanks > > > Francesco > > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From nhv at cape.com Sat Jan 19 08:18:03 2013 From: nhv at cape.com (Norman Vine) Date: Sat, 19 Jan 2013 08:18:03 -0500 Subject: [SciPy-User] Epanechnikov kernel In-Reply-To: References: <6786840.3047071358595289503.JavaMail.defaultUser@defaultHost> Message-ID: for a pdf function that can use most of the standard kernels see http://bonsai.hgc.jp/~mdehoon/software/python/Statistics/ "y, x = pdf(data, weight = None, h = None, kernel = 'Epanechnikov', n = 100)\n" "or\n" "y = pdf(data, x, weight = None, h = None, kernel = 'Epanechnikov')\n" "\n" "This function estimates the probability density function from the random\n" "numbers in the array data, using the bandwidth h and the specified kernel\n" "function.\n" ???.. 
"o) The keyword argument 'kernel' specifies the kernel function:\n" " -'E' or 'Epanechnikov' : Epanechnikov kernel (default)\n" " -'U' or 'Uniform' : Uniform kernel\n" " -'T' or 'Triangle' : Triangle kernel\n" " -'G' or 'Gaussian' : Gaussian kernel\n" " -'B' or 'Biweight' : Quartic/biweight kernel\n" " -'3' or 'Triweight' : Triweight kernel\n" " -'C' or 'Cosine' : Cosine kernel\n" On Jan 19, 2013, at 7:49 AM, josef.pktd at gmail.com wrote: > On Sat, Jan 19, 2013 at 6:34 AM, francescoboccacci at libero.it > wrote: >> Hi all, >> >> I have a question for you. Is it possible in scipy using a Epanechnikov >> kernel function? >> >> I checked on scipy documentation but i found that the only way to calculate >> kernel-density estimate is possible only with using Gaussian kernels? >> >> Is it true? > > Yes, kde in scipy.stats only has gaussian_kde > > Also in statsmodels currently only gaussian is supported for > continuous data > http://statsmodels.sourceforge.net/devel/nonparametric.html > (It was removed because in the references only the bandwidth selection > made much difference in the estimation, but not the shape of the > kernel. Other kernels for continuous variables will come back > eventually. > There is still some old code in the sandbox for generic kernels > https://github.com/statsmodels/statsmodels/blob/master/statsmodels/sandbox/nonparametric/kernels.py > No idea about the status. > On the other hand we do have automatic bandwidth selection, and > kernels for categorical and ordered variables.) > > astroML http://astroml.github.com/modules/generated/astroML.density_estimation.KDE.html#astroML.density_estimation.KDE > has some other kernels, but not Epanechnikov. > > Josef > >> >> Can you help me? >> >> >> Thanks >> >> >> Francesco >> >> >> >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user >> > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user -------------- next part -------------- An HTML attachment was scrubbed... URL: From jsseabold at gmail.com Sat Jan 19 08:32:07 2013 From: jsseabold at gmail.com (Skipper Seabold) Date: Sat, 19 Jan 2013 08:32:07 -0500 Subject: [SciPy-User] Epanechnikov kernel In-Reply-To: References: <6786840.3047071358595289503.JavaMail.defaultUser@defaultHost> Message-ID: On Sat, Jan 19, 2013 at 7:49 AM, wrote: > On Sat, Jan 19, 2013 at 6:34 AM, francescoboccacci at libero.it > wrote: >> Hi all, >> >> I have a question for you. Is it possible in scipy using a Epanechnikov >> kernel function? >> >> I checked on scipy documentation but i found that the only way to calculate >> kernel-density estimate is possible only with using Gaussian kernels? >> >> Is it true? > > Yes, kde in scipy.stats only has gaussian_kde > > Also in statsmodels currently only gaussian is supported for > continuous data > http://statsmodels.sourceforge.net/devel/nonparametric.html > (It was removed because in the references only the bandwidth selection > made much difference in the estimation, but not the shape of the > kernel. Other kernels for continuous variables will come back > eventually. If you're interested in univariate KDE, then we do have the Epanechnikov kernel. 
http://statsmodels.sourceforge.net/devel/generated/statsmodels.nonparametric.kde.KDEUnivariate.fit.html#statsmodels.nonparametric.kde.KDEUnivariate.fit Skipper From francescoboccacci at libero.it Sat Jan 19 08:39:17 2013 From: francescoboccacci at libero.it (francescoboccacci at libero.it) Date: Sat, 19 Jan 2013 14:39:17 +0100 (CET) Subject: [SciPy-User] R: Re: Epanechnikov kernel Message-ID: <30888997.3068041358602757015.JavaMail.defaultUser@defaultHost> Thanks.I will investigate on it. Francesco >----Messaggio originale---- >Da: josef.pktd at gmail.com >Data: 19/01/2013 13.49 >A: "SciPy Users List" >Ogg: Re: [SciPy-User] Epanechnikov kernel > >On Sat, Jan 19, 2013 at 6:34 AM, francescoboccacci at libero.it > wrote: >> Hi all, >> >> I have a question for you. Is it possible in scipy using a Epanechnikov >> kernel function? >> >> I checked on scipy documentation but i found that the only way to calculate >> kernel-density estimate is possible only with using Gaussian kernels? >> >> Is it true? > >Yes, kde in scipy.stats only has gaussian_kde > >Also in statsmodels currently only gaussian is supported for >continuous data >http://statsmodels.sourceforge.net/devel/nonparametric.html >(It was removed because in the references only the bandwidth selection >made much difference in the estimation, but not the shape of the >kernel. Other kernels for continuous variables will come back >eventually. >There is still some old code in the sandbox for generic kernels >https://github. com/statsmodels/statsmodels/blob/master/statsmodels/sandbox/nonparametric/kernels. py >No idea about the status. >On the other hand we do have automatic bandwidth selection, and >kernels for categorical and ordered variables.) > >astroML http://astroml.github.com/modules/generated/astroML. density_estimation.KDE.html#astroML.density_estimation.KDE >has some other kernels, but not Epanechnikov. > >Josef > >> >> Can you help me? >> >> >> Thanks >> >> >> Francesco >> >> >> >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user >> >_______________________________________________ >SciPy-User mailing list >SciPy-User at scipy.org >http://mail.scipy.org/mailman/listinfo/scipy-user > From francescoboccacci at libero.it Sat Jan 19 08:39:58 2013 From: francescoboccacci at libero.it (francescoboccacci at libero.it) Date: Sat, 19 Jan 2013 14:39:58 +0100 (CET) Subject: [SciPy-User] R: Re: Epanechnikov kernel Message-ID: <30841157.3068221358602798612.JavaMail.defaultUser@defaultHost> Thanks Francesco >----Messaggio originale---- >Da: jsseabold at gmail.com >Data: 19/01/2013 14.32 >A: "SciPy Users List" >Ogg: Re: [SciPy-User] Epanechnikov kernel > >On Sat, Jan 19, 2013 at 7:49 AM, wrote: >> On Sat, Jan 19, 2013 at 6:34 AM, francescoboccacci at libero.it >> wrote: >>> Hi all, >>> >>> I have a question for you. Is it possible in scipy using a Epanechnikov >>> kernel function? >>> >>> I checked on scipy documentation but i found that the only way to calculate >>> kernel-density estimate is possible only with using Gaussian kernels? >>> >>> Is it true? >> >> Yes, kde in scipy.stats only has gaussian_kde >> >> Also in statsmodels currently only gaussian is supported for >> continuous data >> http://statsmodels.sourceforge.net/devel/nonparametric.html >> (It was removed because in the references only the bandwidth selection >> made much difference in the estimation, but not the shape of the >> kernel. 
Other kernels for continuous variables will come back >> eventually. > >If you're interested in univariate KDE, then we do have the Epanechnikov kernel. > >http://statsmodels.sourceforge.net/devel/generated/statsmodels.nonparametric. kde.KDEUnivariate.fit.html#statsmodels.nonparametric.kde.KDEUnivariate.fit > >Skipper >_______________________________________________ >SciPy-User mailing list >SciPy-User at scipy.org >http://mail.scipy.org/mailman/listinfo/scipy-user > From josef.pktd at gmail.com Sat Jan 19 08:40:42 2013 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Sat, 19 Jan 2013 08:40:42 -0500 Subject: [SciPy-User] Epanechnikov kernel In-Reply-To: References: <6786840.3047071358595289503.JavaMail.defaultUser@defaultHost> Message-ID: On Sat, Jan 19, 2013 at 8:32 AM, Skipper Seabold wrote: > On Sat, Jan 19, 2013 at 7:49 AM, wrote: >> On Sat, Jan 19, 2013 at 6:34 AM, francescoboccacci at libero.it >> wrote: >>> Hi all, >>> >>> I have a question for you. Is it possible in scipy using a Epanechnikov >>> kernel function? >>> >>> I checked on scipy documentation but i found that the only way to calculate >>> kernel-density estimate is possible only with using Gaussian kernels? >>> >>> Is it true? >> >> Yes, kde in scipy.stats only has gaussian_kde >> >> Also in statsmodels currently only gaussian is supported for >> continuous data >> http://statsmodels.sourceforge.net/devel/nonparametric.html >> (It was removed because in the references only the bandwidth selection >> made much difference in the estimation, but not the shape of the >> kernel. Other kernels for continuous variables will come back >> eventually. > > If you're interested in univariate KDE, then we do have the Epanechnikov kernel. > > http://statsmodels.sourceforge.net/devel/generated/statsmodels.nonparametric.kde.KDEUnivariate.fit.html#statsmodels.nonparametric.kde.KDEUnivariate.fit oops, I was only looking at the fft part. I stand corrected. Josef > > Skipper > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user From francescoboccacci at libero.it Sat Jan 19 08:48:49 2013 From: francescoboccacci at libero.it (francescoboccacci at libero.it) Date: Sat, 19 Jan 2013 14:48:49 +0100 (CET) Subject: [SciPy-User] R: Re: Epanechnikov kernel Message-ID: <31533968.3070111358603329134.JavaMail.defaultUser@defaultHost> Hi, is there a possibility to multivariate KDE using Epanechnikov kernel? my variables are X Y (point position) Thanks Francesco >----Messaggio originale---- >Da: jsseabold at gmail.com >Data: 19/01/2013 14.32 >A: "SciPy Users List" >Ogg: Re: [SciPy-User] Epanechnikov kernel > >On Sat, Jan 19, 2013 at 7:49 AM, wrote: >> On Sat, Jan 19, 2013 at 6:34 AM, francescoboccacci at libero.it >> wrote: >>> Hi all, >>> >>> I have a question for you. Is it possible in scipy using a Epanechnikov >>> kernel function? >>> >>> I checked on scipy documentation but i found that the only way to calculate >>> kernel-density estimate is possible only with using Gaussian kernels? >>> >>> Is it true? >> >> Yes, kde in scipy.stats only has gaussian_kde >> >> Also in statsmodels currently only gaussian is supported for >> continuous data >> http://statsmodels.sourceforge.net/devel/nonparametric.html >> (It was removed because in the references only the bandwidth selection >> made much difference in the estimation, but not the shape of the >> kernel. Other kernels for continuous variables will come back >> eventually. 
> >If you're interested in univariate KDE, then we do have the Epanechnikov kernel. > >http://statsmodels.sourceforge.net/devel/generated/statsmodels.nonparametric. kde.KDEUnivariate.fit.html#statsmodels.nonparametric.kde.KDEUnivariate.fit > >Skipper >_______________________________________________ >SciPy-User mailing list >SciPy-User at scipy.org >http://mail.scipy.org/mailman/listinfo/scipy-user > From jsseabold at gmail.com Sat Jan 19 09:21:47 2013 From: jsseabold at gmail.com (Skipper Seabold) Date: Sat, 19 Jan 2013 09:21:47 -0500 Subject: [SciPy-User] R: Re: Epanechnikov kernel In-Reply-To: <31533968.3070111358603329134.JavaMail.defaultUser@defaultHost> References: <31533968.3070111358603329134.JavaMail.defaultUser@defaultHost> Message-ID: On Sat, Jan 19, 2013 at 8:48 AM, francescoboccacci at libero.it wrote: > Hi, > is there a possibility to multivariate KDE using Epanechnikov kernel? my > variables are X Y (point position) > As Josef mentioned there is no way for the user to choose the kernel at present. The functionality is there, but it needs to be hooked in with a suitable API. I didn't keep up with these discussions, so I don't know the current status. If it's something you're interested in trying to help with, I'm sure people would be appreciative and you can ping the statsmodels mailing list. Practically though, the reason this hasn't been done yet is that the choice of the kernel is not all that important. Bandwidth selection is the most important variable and other kernels perform similarly given a good bandwidth. Is there any particular reason you want Epanechnikov kernel in particular? Skipper > Thanks > > Francesco > >>----Messaggio originale---- >>Da: jsseabold at gmail.com >>Data: 19/01/2013 14.32 >>A: "SciPy Users List" >>Ogg: Re: [SciPy-User] Epanechnikov kernel >> >>On Sat, Jan 19, 2013 at 7:49 AM, wrote: >>> On Sat, Jan 19, 2013 at 6:34 AM, francescoboccacci at libero.it >>> wrote: >>>> Hi all, >>>> >>>> I have a question for you. Is it possible in scipy using a Epanechnikov >>>> kernel function? >>>> >>>> I checked on scipy documentation but i found that the only way to > calculate >>>> kernel-density estimate is possible only with using Gaussian kernels? >>>> >>>> Is it true? >>> >>> Yes, kde in scipy.stats only has gaussian_kde >>> >>> Also in statsmodels currently only gaussian is supported for >>> continuous data >>> http://statsmodels.sourceforge.net/devel/nonparametric.html >>> (It was removed because in the references only the bandwidth selection >>> made much difference in the estimation, but not the shape of the >>> kernel. Other kernels for continuous variables will come back >>> eventually. >> >>If you're interested in univariate KDE, then we do have the Epanechnikov > kernel. >> >>http://statsmodels.sourceforge.net/devel/generated/statsmodels.nonparametric. 
> kde.KDEUnivariate.fit.html#statsmodels.nonparametric.kde.KDEUnivariate.fit >> >>Skipper >>_______________________________________________ >>SciPy-User mailing list >>SciPy-User at scipy.org >>http://mail.scipy.org/mailman/listinfo/scipy-user >> > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user From francescoboccacci at libero.it Sat Jan 19 09:57:24 2013 From: francescoboccacci at libero.it (francescoboccacci at libero.it) Date: Sat, 19 Jan 2013 15:57:24 +0100 (CET) Subject: [SciPy-User] R: Re: R: Re: Epanechnikov kernel Message-ID: <7215417.3084671358607444348.JavaMail.defaultUser@defaultHost> Hi, i would like to use a Epanechnikov kernel because i would like replicate an R function that use Epanechnikov kernel. Reading in depth a documentation below documentation: http://rgm3.lab.nig.ac.jp/RGM/r_function?p=adehabitatHR&f=kernelUD i found that i can use normal kernel (i think guaussion kernel). Below i write a pieces of my code: xmin = min(xPoints) xmax = max(xPoints) ymin = min(yPoints) ymax = max(yPoints) X,Y = np.mgrid[xmin:xmax:40j, ymin:ymax:40j] positions = np.vstack([X.ravel(), Y.ravel()]) values = np.vstack([xPoints,yPoints]) # scipy.stats.kde.gaussian_kde -- # Representation of a kernel-density estimate using Gaussian kernels. kernel = stats.kde.gaussian_kde(values) Z = np.reshape(kernel(positions).T, X.T.shape) If i understood in right way the missing part that i have to implement is the smoothing paramter h: h = Sigma*n^(-1/6) where Sigma = 0.5*(sd(x)+sd(y)) My new question is: How can set smooting parameter in stats.kde.gaussian_kde function? is it possible? Thanks Francesco >----Messaggio originale---- >Da: jsseabold at gmail.com >Data: 19/01/2013 15.21 >A: "francescoboccacci at libero.it", "SciPy Users List" >Ogg: Re: [SciPy-User] R: Re: Epanechnikov kernel > >On Sat, Jan 19, 2013 at 8:48 AM, francescoboccacci at libero.it > wrote: >> Hi, >> is there a possibility to multivariate KDE using Epanechnikov kernel? my >> variables are X Y (point position) >> > >As Josef mentioned there is no way for the user to choose the kernel >at present. The functionality is there, but it needs to be hooked in >with a suitable API. I didn't keep up with these discussions, so I >don't know the current status. If it's something you're interested in >trying to help with, I'm sure people would be appreciative and you can >ping the statsmodels mailing list. > >Practically though, the reason this hasn't been done yet is that the >choice of the kernel is not all that important. Bandwidth selection is >the most important variable and other kernels perform similarly given >a good bandwidth. Is there any particular reason you want Epanechnikov >kernel in particular? > >Skipper > >> Thanks >> >> Francesco >> >>>----Messaggio originale---- >>>Da: jsseabold at gmail.com >>>Data: 19/01/2013 14.32 >>>A: "SciPy Users List" >>>Ogg: Re: [SciPy-User] Epanechnikov kernel >>> >>>On Sat, Jan 19, 2013 at 7:49 AM, wrote: >>>> On Sat, Jan 19, 2013 at 6:34 AM, francescoboccacci at libero.it >>>> wrote: >>>>> Hi all, >>>>> >>>>> I have a question for you. Is it possible in scipy using a Epanechnikov >>>>> kernel function? >>>>> >>>>> I checked on scipy documentation but i found that the only way to >> calculate >>>>> kernel-density estimate is possible only with using Gaussian kernels? >>>>> >>>>> Is it true? 
>>>> >>>> Yes, kde in scipy.stats only has gaussian_kde >>>> >>>> Also in statsmodels currently only gaussian is supported for >>>> continuous data >>>> http://statsmodels.sourceforge.net/devel/nonparametric.html >>>> (It was removed because in the references only the bandwidth selection >>>> made much difference in the estimation, but not the shape of the >>>> kernel. Other kernels for continuous variables will come back >>>> eventually. >>> >>>If you're interested in univariate KDE, then we do have the Epanechnikov >> kernel. >>> >>>http://statsmodels.sourceforge.net/devel/generated/statsmodels. nonparametric. >> kde.KDEUnivariate.fit.html#statsmodels.nonparametric.kde.KDEUnivariate.fit >>> >>>Skipper >>>_______________________________________________ >>>SciPy-User mailing list >>>SciPy-User at scipy.org >>>http://mail.scipy.org/mailman/listinfo/scipy-user >>> >> >> >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user > From josef.pktd at gmail.com Sat Jan 19 10:06:00 2013 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Sat, 19 Jan 2013 10:06:00 -0500 Subject: [SciPy-User] R: Re: R: Re: Epanechnikov kernel In-Reply-To: <7215417.3084671358607444348.JavaMail.defaultUser@defaultHost> References: <7215417.3084671358607444348.JavaMail.defaultUser@defaultHost> Message-ID: On Sat, Jan 19, 2013 at 9:57 AM, francescoboccacci at libero.it wrote: > Hi, > i would like to use a Epanechnikov kernel because i would like replicate an R > function that use Epanechnikov kernel. > Reading in depth a documentation below documentation: > > > http://rgm3.lab.nig.ac.jp/RGM/r_function?p=adehabitatHR&f=kernelUD > > i found that i can use normal kernel (i think guaussion kernel). > Below i write a pieces of my code: > > > xmin = min(xPoints) > xmax = max(xPoints) > ymin = min(yPoints) > ymax = max(yPoints) > X,Y = np.mgrid[xmin:xmax:40j, ymin:ymax:40j] > positions = np.vstack([X.ravel(), Y.ravel()]) > values = np.vstack([xPoints,yPoints]) > # scipy.stats.kde.gaussian_kde -- > # Representation of a kernel-density estimate using Gaussian > kernels. > kernel = stats.kde.gaussian_kde(values) > > Z = np.reshape(kernel(positions).T, X.T.shape) > > If i understood in right way the missing part that i have to implement is the > smoothing paramter h: > > h = Sigma*n^(-1/6) > > where > > Sigma = 0.5*(sd(x)+sd(y)) > > > My new question is: > > How can set smooting parameter in stats.kde.gaussian_kde function? is it > possible? In a recent scipy (since 0.10 IIRC) you can directly set the bandwidth without subclassing http://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.gaussian_kde.html#scipy.stats.gaussian_kde http://docs.scipy.org/doc/scipy/reference/tutorial/stats.html#kernel-density-estimation Josef > > Thanks > > Francesco > > >>----Messaggio originale---- >>Da: jsseabold at gmail.com >>Data: 19/01/2013 15.21 >>A: "francescoboccacci at libero.it", "SciPy Users > List" >>Ogg: Re: [SciPy-User] R: Re: Epanechnikov kernel >> >>On Sat, Jan 19, 2013 at 8:48 AM, francescoboccacci at libero.it >> wrote: >>> Hi, >>> is there a possibility to multivariate KDE using Epanechnikov kernel? my >>> variables are X Y (point position) >>> >> >>As Josef mentioned there is no way for the user to choose the kernel >>at present. The functionality is there, but it needs to be hooked in >>with a suitable API. I didn't keep up with these discussions, so I >>don't know the current status. 
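To make the bandwidth pointer above concrete, here is a sketch of how the ad-hoc reference bandwidth h = 0.5*(sd(x)+sd(y))*n^(-1/6) could be wired into scipy.stats.gaussian_kde, assuming the xPoints/yPoints arrays from the earlier snippet and a SciPy new enough to accept the bw_method argument. Note that gaussian_kde scales the full data covariance by a single factor, so this is only an approximation of the common-Sigma rule when sd(x) and sd(y) differ:

import numpy as np
from scipy import stats

values = np.vstack([xPoints, yPoints])   # xPoints, yPoints as in the snippet above
n = values.shape[1]

# reference bandwidth from the adehabitatHR docs
sigma = 0.5 * (np.std(xPoints, ddof=1) + np.std(yPoints, ddof=1))
h = sigma * n ** (-1.0 / 6.0)

# Passing a scalar to bw_method sets gaussian_kde's multiplicative factor:
# each axis then gets bandwidth sd(axis) * factor, and the kernel keeps the
# data's correlation structure.  factor = n**(-1/6) is the closest one-liner.
kernel = stats.gaussian_kde(values, bw_method=n ** (-1.0 / 6.0))

# Forcing exactly h on both axes would mean overriding the covariance
# (e.g. to h**2 * np.eye(2)) in a small subclass; the line above is the
# no-subclassing approximation.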
If it's something you're interested in >>trying to help with, I'm sure people would be appreciative and you can >>ping the statsmodels mailing list. >> >>Practically though, the reason this hasn't been done yet is that the >>choice of the kernel is not all that important. Bandwidth selection is >>the most important variable and other kernels perform similarly given >>a good bandwidth. Is there any particular reason you want Epanechnikov >>kernel in particular? >> >>Skipper >> >>> Thanks >>> >>> Francesco >>> >>>>----Messaggio originale---- >>>>Da: jsseabold at gmail.com >>>>Data: 19/01/2013 14.32 >>>>A: "SciPy Users List" >>>>Ogg: Re: [SciPy-User] Epanechnikov kernel >>>> >>>>On Sat, Jan 19, 2013 at 7:49 AM, wrote: >>>>> On Sat, Jan 19, 2013 at 6:34 AM, francescoboccacci at libero.it >>>>> wrote: >>>>>> Hi all, >>>>>> >>>>>> I have a question for you. Is it possible in scipy using a Epanechnikov >>>>>> kernel function? >>>>>> >>>>>> I checked on scipy documentation but i found that the only way to >>> calculate >>>>>> kernel-density estimate is possible only with using Gaussian kernels? >>>>>> >>>>>> Is it true? >>>>> >>>>> Yes, kde in scipy.stats only has gaussian_kde >>>>> >>>>> Also in statsmodels currently only gaussian is supported for >>>>> continuous data >>>>> http://statsmodels.sourceforge.net/devel/nonparametric.html >>>>> (It was removed because in the references only the bandwidth selection >>>>> made much difference in the estimation, but not the shape of the >>>>> kernel. Other kernels for continuous variables will come back >>>>> eventually. >>>> >>>>If you're interested in univariate KDE, then we do have the Epanechnikov >>> kernel. >>>> >>>>http://statsmodels.sourceforge.net/devel/generated/statsmodels. > nonparametric. >>> kde.KDEUnivariate.fit.html#statsmodels.nonparametric.kde.KDEUnivariate.fit >>>> >>>>Skipper >>>>_______________________________________________ >>>>SciPy-User mailing list >>>>SciPy-User at scipy.org >>>>http://mail.scipy.org/mailman/listinfo/scipy-user >>>> >>> >>> >>> _______________________________________________ >>> SciPy-User mailing list >>> SciPy-User at scipy.org >>> http://mail.scipy.org/mailman/listinfo/scipy-user >> > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user From francescoboccacci at libero.it Sat Jan 19 10:18:01 2013 From: francescoboccacci at libero.it (francescoboccacci at libero.it) Date: Sat, 19 Jan 2013 16:18:01 +0100 (CET) Subject: [SciPy-User] R: Re: R: Re: R: Re: Epanechnikov kernel Message-ID: <29952172.3089071358608681087.JavaMail.defaultUser@defaultHost> Thanks Josef, i will investigate on it. I'm using scipy version '0.9.0' so i need to update it. If i have some problems i will ask you again :). Thanks for your time Francesco >----Messaggio originale---- >Da: josef.pktd at gmail.com >Data: 19/01/2013 16.06 >A: "francescoboccacci at libero.it", "SciPy Users List" >Ogg: Re: [SciPy-User] R: Re: R: Re: Epanechnikov kernel > >On Sat, Jan 19, 2013 at 9:57 AM, francescoboccacci at libero.it > wrote: >> Hi, >> i would like to use a Epanechnikov kernel because i would like replicate an R >> function that use Epanechnikov kernel. >> Reading in depth a documentation below documentation: >> >> >> http://rgm3.lab.nig.ac.jp/RGM/r_function?p=adehabitatHR&f=kernelUD >> >> i found that i can use normal kernel (i think guaussion kernel). 
>> Below i write a pieces of my code: >> >> >> xmin = min(xPoints) >> xmax = max(xPoints) >> ymin = min(yPoints) >> ymax = max(yPoints) >> X,Y = np.mgrid[xmin:xmax:40j, ymin:ymax:40j] >> positions = np.vstack([X.ravel(), Y.ravel()]) >> values = np.vstack([xPoints,yPoints]) >> # scipy.stats.kde.gaussian_kde -- >> # Representation of a kernel-density estimate using Gaussian >> kernels. >> kernel = stats.kde.gaussian_kde(values) >> >> Z = np.reshape(kernel(positions).T, X.T.shape) >> >> If i understood in right way the missing part that i have to implement is the >> smoothing paramter h: >> >> h = Sigma*n^(-1/6) >> >> where >> >> Sigma = 0.5*(sd(x)+sd(y)) >> >> >> My new question is: >> >> How can set smooting parameter in stats.kde.gaussian_kde function? is it >> possible? > >In a recent scipy (since 0.10 IIRC) you can directly set the bandwidth >without subclassing > >http://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.gaussian_kde. html#scipy.stats.gaussian_kde >http://docs.scipy.org/doc/scipy/reference/tutorial/stats.html#kernel-density- estimation > >Josef > >> >> Thanks >> >> Francesco >> >> >>>----Messaggio originale---- >>>Da: jsseabold at gmail.com >>>Data: 19/01/2013 15.21 >>>A: "francescoboccacci at libero.it", "SciPy Users >> List" >>>Ogg: Re: [SciPy-User] R: Re: Epanechnikov kernel >>> >>>On Sat, Jan 19, 2013 at 8:48 AM, francescoboccacci at libero.it >>> wrote: >>>> Hi, >>>> is there a possibility to multivariate KDE using Epanechnikov kernel? my >>>> variables are X Y (point position) >>>> >>> >>>As Josef mentioned there is no way for the user to choose the kernel >>>at present. The functionality is there, but it needs to be hooked in >>>with a suitable API. I didn't keep up with these discussions, so I >>>don't know the current status. If it's something you're interested in >>>trying to help with, I'm sure people would be appreciative and you can >>>ping the statsmodels mailing list. >>> >>>Practically though, the reason this hasn't been done yet is that the >>>choice of the kernel is not all that important. Bandwidth selection is >>>the most important variable and other kernels perform similarly given >>>a good bandwidth. Is there any particular reason you want Epanechnikov >>>kernel in particular? >>> >>>Skipper >>> >>>> Thanks >>>> >>>> Francesco >>>> >>>>>----Messaggio originale---- >>>>>Da: jsseabold at gmail.com >>>>>Data: 19/01/2013 14.32 >>>>>A: "SciPy Users List" >>>>>Ogg: Re: [SciPy-User] Epanechnikov kernel >>>>> >>>>>On Sat, Jan 19, 2013 at 7:49 AM, wrote: >>>>>> On Sat, Jan 19, 2013 at 6:34 AM, francescoboccacci at libero.it >>>>>> wrote: >>>>>>> Hi all, >>>>>>> >>>>>>> I have a question for you. Is it possible in scipy using a Epanechnikov >>>>>>> kernel function? >>>>>>> >>>>>>> I checked on scipy documentation but i found that the only way to >>>> calculate >>>>>>> kernel-density estimate is possible only with using Gaussian kernels? >>>>>>> >>>>>>> Is it true? >>>>>> >>>>>> Yes, kde in scipy.stats only has gaussian_kde >>>>>> >>>>>> Also in statsmodels currently only gaussian is supported for >>>>>> continuous data >>>>>> http://statsmodels.sourceforge.net/devel/nonparametric.html >>>>>> (It was removed because in the references only the bandwidth selection >>>>>> made much difference in the estimation, but not the shape of the >>>>>> kernel. Other kernels for continuous variables will come back >>>>>> eventually. >>>>> >>>>>If you're interested in univariate KDE, then we do have the Epanechnikov >>>> kernel. 
>>>>> >>>>>http://statsmodels.sourceforge.net/devel/generated/statsmodels. >> nonparametric. >>>> kde.KDEUnivariate.fit.html#statsmodels.nonparametric.kde.KDEUnivariate. fit >>>>> >>>>>Skipper >>>>>_______________________________________________ >>>>>SciPy-User mailing list >>>>>SciPy-User at scipy.org >>>>>http://mail.scipy.org/mailman/listinfo/scipy-user >>>>> >>>> >>>> >>>> _______________________________________________ >>>> SciPy-User mailing list >>>> SciPy-User at scipy.org >>>> http://mail.scipy.org/mailman/listinfo/scipy-user >>> >> >> >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user > From patrickmarshwx at gmail.com Sat Jan 19 12:46:40 2013 From: patrickmarshwx at gmail.com (Patrick Marsh) Date: Sat, 19 Jan 2013 11:46:40 -0600 Subject: [SciPy-User] R: Re: R: Re: R: Re: Epanechnikov kernel In-Reply-To: <29952172.3089071358608681087.JavaMail.defaultUser@defaultHost> References: <29952172.3089071358608681087.JavaMail.defaultUser@defaultHost> Message-ID: I apologize if this is a duplicate...I used the wrong email initially and wasn't sure if it would go through the listserv.... I've previously coded up a Cython version of the Epanechnikov kernel. You can find the function here: https://gist.github.com/4573808 It's certainly not optimized. It was a quick hack for use with rare (spatial) meteorological events. As the grid density increases, the performance decreases significantly. At this point, your best bet would be to create a grid that has the weights of the Epanechnikov kernel, and do a FFT convolve between the two grids. A pseudocode example (that I believe should work) is shown below... ============================================ import numpy as np import scipy as sp import epanechnikov (from the gist linked to above) data_to_kde = ... # Your 2D array # Create a grid with a value of 1 at the midpoint raw_epan_grid = np.zeros((51, 51), dtype=np.float64) raw_epan_gird[25, 25] = 1 # Convert this binary grid into the weights of the Epanechnikov kernel bandwidth = 10 dx = 1 epan_kernel = epanechnikov(raw_epan_grid, bandwidth, dx) # Use FFTCONVOLVE to do the smoothing in Fourier space data_smoothed = sp.signal.fftconvolve(data_to_kde, epan_kernel, mode='same') ============================================ This is slower than the function linked above for sparse grids, but faster for dense grids. (The runtime of fftconvolve is dependent upon the size of your arrays, not the density.) Hope this helps Patrick --- Patrick Marsh Ph.D. Candidate / Liaison to the HWT School of Meteorology / University of Oklahoma Cooperative Institute for Mesoscale Meteorological Studies National Severe Storms Laboratory http://www.patricktmarsh.com On Sat, Jan 19, 2013 at 9:18 AM, francescoboccacci at libero.it < francescoboccacci at libero.it> wrote: > Thanks Josef, i will investigate on it. > I'm using scipy version '0.9.0' so i need to update it. > If i have some problems i will ask you again :). > Thanks for your time > > Francesco > > >----Messaggio originale---- > >Da: josef.pktd at gmail.com > >Data: 19/01/2013 16.06 > >A: "francescoboccacci at libero.it", "SciPy > Users > List" > >Ogg: Re: [SciPy-User] R: Re: R: Re: Epanechnikov kernel > > > >On Sat, Jan 19, 2013 at 9:57 AM, francescoboccacci at libero.it > > wrote: > >> Hi, > >> i would like to use a Epanechnikov kernel because i would like > replicate > an R > >> function that use Epanechnikov kernel. 
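For anyone who cannot build the Cython helper from the gist, the 2-D Epanechnikov weight grid that the fftconvolve recipe above expects can also be built with plain NumPy. This is a sketch under the same assumptions as the pseudocode (bandwidth and dx in grid units, data_to_kde being the gridded field); the epan_kernel name is just reused for illustration:

import numpy as np
from scipy import signal

bandwidth = 10.0                          # kernel radius, in physical units
dx = 1.0                                  # grid spacing
half = int(np.ceil(bandwidth / dx))
y, x = np.mgrid[-half:half + 1, -half:half + 1]
r2 = (x * dx) ** 2 + (y * dx) ** 2

epan_kernel = np.maximum(1.0 - r2 / bandwidth ** 2, 0.0)   # (1 - (r/h)^2), clipped at 0
epan_kernel /= epan_kernel.sum()                           # normalize so the field total is preserved

# data_to_kde = ...                      # 2-D array, as in the example above
# data_smoothed = signal.fftconvolve(data_to_kde, epan_kernel, mode='same')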
> >> Reading in depth a documentation below documentation: > >> > >> > >> http://rgm3.lab.nig.ac.jp/RGM/r_function?p=adehabitatHR&f=kernelUD > >> > >> i found that i can use normal kernel (i think guaussion kernel). > >> Below i write a pieces of my code: > >> > >> > >> xmin = min(xPoints) > >> xmax = max(xPoints) > >> ymin = min(yPoints) > >> ymax = max(yPoints) > >> X,Y = np.mgrid[xmin:xmax:40j, ymin:ymax:40j] > >> positions = np.vstack([X.ravel(), Y.ravel()]) > >> values = np.vstack([xPoints,yPoints]) > >> # scipy.stats.kde.gaussian_kde -- > >> # Representation of a kernel-density estimate using > Gaussian > >> kernels. > >> kernel = stats.kde.gaussian_kde(values) > >> > >> Z = np.reshape(kernel(positions).T, X.T.shape) > >> > >> If i understood in right way the missing part that i have to implement > is > the > >> smoothing paramter h: > >> > >> h = Sigma*n^(-1/6) > >> > >> where > >> > >> Sigma = 0.5*(sd(x)+sd(y)) > >> > >> > >> My new question is: > >> > >> How can set smooting parameter in stats.kde.gaussian_kde function? is it > >> possible? > > > >In a recent scipy (since 0.10 IIRC) you can directly set the bandwidth > >without subclassing > > > > > http://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.gaussian_kde. > html#scipy.stats.gaussian_kde > > > http://docs.scipy.org/doc/scipy/reference/tutorial/stats.html#kernel-density- > estimation > > > >Josef > > > >> > >> Thanks > >> > >> Francesco > >> > >> > >>>----Messaggio originale---- > >>>Da: jsseabold at gmail.com > >>>Data: 19/01/2013 15.21 > >>>A: "francescoboccacci at libero.it", "SciPy > Users > >> List" > >>>Ogg: Re: [SciPy-User] R: Re: Epanechnikov kernel > >>> > >>>On Sat, Jan 19, 2013 at 8:48 AM, francescoboccacci at libero.it > >>> wrote: > >>>> Hi, > >>>> is there a possibility to multivariate KDE using Epanechnikov > kernel? my > >>>> variables are X Y (point position) > >>>> > >>> > >>>As Josef mentioned there is no way for the user to choose the kernel > >>>at present. The functionality is there, but it needs to be hooked in > >>>with a suitable API. I didn't keep up with these discussions, so I > >>>don't know the current status. If it's something you're interested in > >>>trying to help with, I'm sure people would be appreciative and you can > >>>ping the statsmodels mailing list. > >>> > >>>Practically though, the reason this hasn't been done yet is that the > >>>choice of the kernel is not all that important. Bandwidth selection is > >>>the most important variable and other kernels perform similarly given > >>>a good bandwidth. Is there any particular reason you want Epanechnikov > >>>kernel in particular? > >>> > >>>Skipper > >>> > >>>> Thanks > >>>> > >>>> Francesco > >>>> > >>>>>----Messaggio originale---- > >>>>>Da: jsseabold at gmail.com > >>>>>Data: 19/01/2013 14.32 > >>>>>A: "SciPy Users List" > >>>>>Ogg: Re: [SciPy-User] Epanechnikov kernel > >>>>> > >>>>>On Sat, Jan 19, 2013 at 7:49 AM, wrote: > >>>>>> On Sat, Jan 19, 2013 at 6:34 AM, francescoboccacci at libero.it > >>>>>> wrote: > >>>>>>> Hi all, > >>>>>>> > >>>>>>> I have a question for you. Is it possible in scipy using a > Epanechnikov > >>>>>>> kernel function? > >>>>>>> > >>>>>>> I checked on scipy documentation but i found that the only way to > >>>> calculate > >>>>>>> kernel-density estimate is possible only with using Gaussian > kernels? > >>>>>>> > >>>>>>> Is it true? 
> >>>>>> > >>>>>> Yes, kde in scipy.stats only has gaussian_kde > >>>>>> > >>>>>> Also in statsmodels currently only gaussian is supported for > >>>>>> continuous data > >>>>>> http://statsmodels.sourceforge.net/devel/nonparametric.html > >>>>>> (It was removed because in the references only the bandwidth > selection > >>>>>> made much difference in the estimation, but not the shape of the > >>>>>> kernel. Other kernels for continuous variables will come back > >>>>>> eventually. > >>>>> > >>>>>If you're interested in univariate KDE, then we do have the > Epanechnikov > >>>> kernel. > >>>>> > >>>>>http://statsmodels.sourceforge.net/devel/generated/statsmodels. > >> nonparametric. > >>>> > kde.KDEUnivariate.fit.html#statsmodels.nonparametric.kde.KDEUnivariate. > fit > >>>>> > >>>>>Skipper > >>>>>_______________________________________________ > >>>>>SciPy-User mailing list > >>>>>SciPy-User at scipy.org > >>>>>http://mail.scipy.org/mailman/listinfo/scipy-user > >>>>> > >>>> > >>>> > >>>> _______________________________________________ > >>>> SciPy-User mailing list > >>>> SciPy-User at scipy.org > >>>> http://mail.scipy.org/mailman/listinfo/scipy-user > >>> > >> > >> > >> _______________________________________________ > >> SciPy-User mailing list > >> SciPy-User at scipy.org > >> http://mail.scipy.org/mailman/listinfo/scipy-user > > > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > -------------- next part -------------- An HTML attachment was scrubbed... URL: From patrickmarshwx at gmail.com Sat Jan 19 12:47:38 2013 From: patrickmarshwx at gmail.com (Patrick Marsh) Date: Sat, 19 Jan 2013 11:47:38 -0600 Subject: [SciPy-User] R: Re: R: Re: R: Re: Epanechnikov kernel In-Reply-To: References: <29952172.3089071358608681087.JavaMail.defaultUser@defaultHost> Message-ID: I should also add that you can approximate an Epanechnikov kernel with a Gaussian kernel. See: http://journals.ametsoc.org/doi/pdf/10.1175/BAMS-D-11-00200.1 The take away line is: "Using the results of Marron and Nolan (1988), it can be shown that, when comparing Epanechnikov and Gaussian kernels, the Epanechnikov kernel must be 2.2138 times larger than the Gaussian bandwidth to achieve a similar response function." So, you can take the bandwidth you'd like to use with the Epanechnikov kernel, and divide it by 2.2138 and plug the result into the Gaussian kernel. It's not exact, but the response is similar. Patrick --- Patrick Marsh Ph.D. Candidate / Liaison to the HWT School of Meteorology / University of Oklahoma Cooperative Institute for Mesoscale Meteorological Studies National Severe Storms Laboratory http://www.patricktmarsh.com On Sat, Jan 19, 2013 at 11:46 AM, Patrick Marsh wrote: > I apologize if this is a duplicate...I used the wrong email initially and > wasn't sure if it would go through the listserv.... > > > > > I've previously coded up a Cython version of the Epanechnikov kernel. You > can find the function here: > > https://gist.github.com/4573808 > > It's certainly not optimized. It was a quick hack for use with rare > (spatial) meteorological events. As the grid density increases, the > performance decreases significantly. At this point, your best bet would be > to create a grid that has the weights of the Epanechnikov kernel, and do a > FFT convolve between the two grids. A pseudocode example (that I believe > should work) is shown below... 
> > > ============================================ > import numpy as np > import scipy as sp > import epanechnikov (from the gist linked to above) > > data_to_kde = ... # Your 2D array > > # Create a grid with a value of 1 at the midpoint > raw_epan_grid = np.zeros((51, 51), dtype=np.float64) > raw_epan_gird[25, 25] = 1 > > # Convert this binary grid into the weights of the Epanechnikov kernel > bandwidth = 10 > dx = 1 > epan_kernel = epanechnikov(raw_epan_grid, bandwidth, dx) > > # Use FFTCONVOLVE to do the smoothing in Fourier space > data_smoothed = sp.signal.fftconvolve(data_to_kde, epan_kernel, > mode='same') > ============================================ > > > This is slower than the function linked above for sparse grids, but faster > for dense grids. (The runtime of fftconvolve is dependent upon the size of > your arrays, not the density.) > > > Hope this helps > Patrick > > --- > Patrick Marsh > Ph.D. Candidate / Liaison to the HWT > School of Meteorology / University of Oklahoma > Cooperative Institute for Mesoscale Meteorological Studies > National Severe Storms Laboratory > http://www.patricktmarsh.com > > > On Sat, Jan 19, 2013 at 9:18 AM, francescoboccacci at libero.it < > francescoboccacci at libero.it> wrote: > >> Thanks Josef, i will investigate on it. >> I'm using scipy version '0.9.0' so i need to update it. >> If i have some problems i will ask you again :). >> Thanks for your time >> >> Francesco >> >> >----Messaggio originale---- >> >Da: josef.pktd at gmail.com >> >Data: 19/01/2013 16.06 >> >A: "francescoboccacci at libero.it", "SciPy >> Users >> List" >> >Ogg: Re: [SciPy-User] R: Re: R: Re: Epanechnikov kernel >> > >> >On Sat, Jan 19, 2013 at 9:57 AM, francescoboccacci at libero.it >> > wrote: >> >> Hi, >> >> i would like to use a Epanechnikov kernel because i would like >> replicate >> an R >> >> function that use Epanechnikov kernel. >> >> Reading in depth a documentation below documentation: >> >> >> >> >> >> http://rgm3.lab.nig.ac.jp/RGM/r_function?p=adehabitatHR&f=kernelUD >> >> >> >> i found that i can use normal kernel (i think guaussion kernel). >> >> Below i write a pieces of my code: >> >> >> >> >> >> xmin = min(xPoints) >> >> xmax = max(xPoints) >> >> ymin = min(yPoints) >> >> ymax = max(yPoints) >> >> X,Y = np.mgrid[xmin:xmax:40j, ymin:ymax:40j] >> >> positions = np.vstack([X.ravel(), Y.ravel()]) >> >> values = np.vstack([xPoints,yPoints]) >> >> # scipy.stats.kde.gaussian_kde -- >> >> # Representation of a kernel-density estimate using >> Gaussian >> >> kernels. >> >> kernel = stats.kde.gaussian_kde(values) >> >> >> >> Z = np.reshape(kernel(positions).T, X.T.shape) >> >> >> >> If i understood in right way the missing part that i have to implement >> is >> the >> >> smoothing paramter h: >> >> >> >> h = Sigma*n^(-1/6) >> >> >> >> where >> >> >> >> Sigma = 0.5*(sd(x)+sd(y)) >> >> >> >> >> >> My new question is: >> >> >> >> How can set smooting parameter in stats.kde.gaussian_kde function? is >> it >> >> possible? >> > >> >In a recent scipy (since 0.10 IIRC) you can directly set the bandwidth >> >without subclassing >> > >> > >> http://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.gaussian_kde. 
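A sketch of the 2.2138 rule of thumb from the message above, applied to a gridded field with scipy.ndimage; the field name and the 10-grid-point Epanechnikov bandwidth are assumptions made only for illustration:

from scipy.ndimage import gaussian_filter

h_epan = 10.0                   # Epanechnikov bandwidth you would have used (grid points)
sigma = h_epan / 2.2138         # roughly equivalent Gaussian bandwidth, per Marron and Nolan

# field = ...                   # 2-D array of gridded observations
# smoothed = gaussian_filter(field, sigma=sigma)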
>> html#scipy.stats.gaussian_kde >> > >> http://docs.scipy.org/doc/scipy/reference/tutorial/stats.html#kernel-density- >> estimation >> > >> >Josef >> > >> >> >> >> Thanks >> >> >> >> Francesco >> >> >> >> >> >>>----Messaggio originale---- >> >>>Da: jsseabold at gmail.com >> >>>Data: 19/01/2013 15.21 >> >>>A: "francescoboccacci at libero.it", "SciPy >> Users >> >> List" >> >>>Ogg: Re: [SciPy-User] R: Re: Epanechnikov kernel >> >>> >> >>>On Sat, Jan 19, 2013 at 8:48 AM, francescoboccacci at libero.it >> >>> wrote: >> >>>> Hi, >> >>>> is there a possibility to multivariate KDE using Epanechnikov >> kernel? my >> >>>> variables are X Y (point position) >> >>>> >> >>> >> >>>As Josef mentioned there is no way for the user to choose the kernel >> >>>at present. The functionality is there, but it needs to be hooked in >> >>>with a suitable API. I didn't keep up with these discussions, so I >> >>>don't know the current status. If it's something you're interested in >> >>>trying to help with, I'm sure people would be appreciative and you can >> >>>ping the statsmodels mailing list. >> >>> >> >>>Practically though, the reason this hasn't been done yet is that the >> >>>choice of the kernel is not all that important. Bandwidth selection is >> >>>the most important variable and other kernels perform similarly given >> >>>a good bandwidth. Is there any particular reason you want Epanechnikov >> >>>kernel in particular? >> >>> >> >>>Skipper >> >>> >> >>>> Thanks >> >>>> >> >>>> Francesco >> >>>> >> >>>>>----Messaggio originale---- >> >>>>>Da: jsseabold at gmail.com >> >>>>>Data: 19/01/2013 14.32 >> >>>>>A: "SciPy Users List" >> >>>>>Ogg: Re: [SciPy-User] Epanechnikov kernel >> >>>>> >> >>>>>On Sat, Jan 19, 2013 at 7:49 AM, wrote: >> >>>>>> On Sat, Jan 19, 2013 at 6:34 AM, francescoboccacci at libero.it >> >>>>>> wrote: >> >>>>>>> Hi all, >> >>>>>>> >> >>>>>>> I have a question for you. Is it possible in scipy using a >> Epanechnikov >> >>>>>>> kernel function? >> >>>>>>> >> >>>>>>> I checked on scipy documentation but i found that the only way to >> >>>> calculate >> >>>>>>> kernel-density estimate is possible only with using Gaussian >> kernels? >> >>>>>>> >> >>>>>>> Is it true? >> >>>>>> >> >>>>>> Yes, kde in scipy.stats only has gaussian_kde >> >>>>>> >> >>>>>> Also in statsmodels currently only gaussian is supported for >> >>>>>> continuous data >> >>>>>> http://statsmodels.sourceforge.net/devel/nonparametric.html >> >>>>>> (It was removed because in the references only the bandwidth >> selection >> >>>>>> made much difference in the estimation, but not the shape of the >> >>>>>> kernel. Other kernels for continuous variables will come back >> >>>>>> eventually. >> >>>>> >> >>>>>If you're interested in univariate KDE, then we do have the >> Epanechnikov >> >>>> kernel. >> >>>>> >> >>>>>http://statsmodels.sourceforge.net/devel/generated/statsmodels. >> >> nonparametric. >> >>>> >> kde.KDEUnivariate.fit.html#statsmodels.nonparametric.kde.KDEUnivariate. 
>> fit >> >>>>> >> >>>>>Skipper >> >>>>>_______________________________________________ >> >>>>>SciPy-User mailing list >> >>>>>SciPy-User at scipy.org >> >>>>>http://mail.scipy.org/mailman/listinfo/scipy-user >> >>>>> >> >>>> >> >>>> >> >>>> _______________________________________________ >> >>>> SciPy-User mailing list >> >>>> SciPy-User at scipy.org >> >>>> http://mail.scipy.org/mailman/listinfo/scipy-user >> >>> >> >> >> >> >> >> _______________________________________________ >> >> SciPy-User mailing list >> >> SciPy-User at scipy.org >> >> http://mail.scipy.org/mailman/listinfo/scipy-user >> > >> >> >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user >> > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From francescoboccacci at libero.it Sat Jan 19 13:39:02 2013 From: francescoboccacci at libero.it (francescoboccacci at libero.it) Date: Sat, 19 Jan 2013 19:39:02 +0100 (CET) Subject: [SciPy-User] R: Re: R: Re: R: Re: R: Re: Epanechnikov kernel Message-ID: <18113989.3127151358620742063.JavaMail.defaultUser@defaultHost> Great. Thanks Patrick.I'll let you know. Francesco ----Messaggio originale---- Da: patrickmarshwx at gmail.com Data: 19/01/2013 18.47 A: "francescoboccacci at libero.it", "SciPy Users List" Ogg: Re: [SciPy-User] R: Re: R: Re: R: Re: Epanechnikov kernel I should also add that you can approximate an Epanechnikov kernel with a Gaussian kernel. See: http://journals.ametsoc.org/doi/pdf/10.1175/BAMS-D-11-00200.1 The take away line is: "Using the results of Marron and Nolan (1988), it can be shown that, when comparing Epanechnikov and Gaussian kernels, the Epanechnikov kernel must be 2.2138 times larger than the Gaussian bandwidth to achieve a similar response function." So, you can take the bandwidth you'd like to use with the Epanechnikov kernel, and divide it by 2.2138 and plug the result into the Gaussian kernel. It's not exact, but the response is similar. Patrick --- Patrick Marsh Ph.D. Candidate / Liaison to the HWT School of Meteorology / University of Oklahoma Cooperative Institute for Mesoscale Meteorological Studies National Severe Storms Laboratory http://www.patricktmarsh.com On Sat, Jan 19, 2013 at 11:46 AM, Patrick Marsh wrote: I apologize if this is a duplicate...I used the wrong email initially and wasn't sure if it would go through the listserv.... I've previously coded up a Cython version of the Epanechnikov kernel. You can find the function here: https://gist.github.com/4573808 It's certainly not optimized. It was a quick hack for use with rare (spatial) meteorological events. As the grid density increases, the performance decreases significantly. At this point, your best bet would be to create a grid that has the weights of the Epanechnikov kernel, and do a FFT convolve between the two grids. A pseudocode example (that I believe should work) is shown below... ============================================ import numpy as npimport scipy as sp import epanechnikov (from the gist linked to above) data_to_kde = ... 
# Your 2D array # Create a grid with a value of 1 at the midpoint raw_epan_grid = np.zeros((51, 51), dtype=np.float64)raw_epan_gird[25, 25] = 1 # Convert this binary grid into the weights of the Epanechnikov kernel bandwidth = 10dx = 1 epan_kernel = epanechnikov(raw_epan_grid, bandwidth, dx) # Use FFTCONVOLVE to do the smoothing in Fourier space data_smoothed = sp.signal.fftconvolve(data_to_kde, epan_kernel, mode='same')============================================ This is slower than the function linked above for sparse grids, but faster for dense grids. (The runtime of fftconvolve is dependent upon the size of your arrays, not the density.) Hope this helps Patrick--- Patrick Marsh Ph.D. Candidate / Liaison to the HWT School of Meteorology / University of Oklahoma Cooperative Institute for Mesoscale Meteorological Studies National Severe Storms Laboratory http://www.patricktmarsh.com On Sat, Jan 19, 2013 at 9:18 AM, francescoboccacci at libero.it wrote: Thanks Josef, i will investigate on it. I'm using scipy version '0.9.0' so i need to update it. If i have some problems i will ask you again :). Thanks for your time Francesco >----Messaggio originale---- >Da: josef.pktd at gmail.com >Data: 19/01/2013 16.06 >A: "francescoboccacci at libero.it", "SciPy Users List" >Ogg: Re: [SciPy-User] R: Re: R: Re: Epanechnikov kernel > >On Sat, Jan 19, 2013 at 9:57 AM, francescoboccacci at libero.it > wrote: >> Hi, >> i would like to use a Epanechnikov kernel because i would like replicate an R >> function that use Epanechnikov kernel. >> Reading in depth a documentation below documentation: >> >> >> http://rgm3.lab.nig.ac.jp/RGM/r_function?p=adehabitatHR&f=kernelUD >> >> i found that i can use normal kernel (i think guaussion kernel). >> Below i write a pieces of my code: >> >> >> xmin = min(xPoints) >> xmax = max(xPoints) >> ymin = min(yPoints) >> ymax = max(yPoints) >> X,Y = np.mgrid[xmin:xmax:40j, ymin:ymax:40j] >> positions = np.vstack([X.ravel(), Y.ravel()]) >> values = np.vstack([xPoints,yPoints]) >> # scipy.stats.kde.gaussian_kde -- >> # Representation of a kernel-density estimate using Gaussian >> kernels. >> kernel = stats.kde.gaussian_kde(values) >> >> Z = np.reshape(kernel(positions).T, X.T.shape) >> >> If i understood in right way the missing part that i have to implement is the >> smoothing paramter h: >> >> h = Sigma*n^(-1/6) >> >> where >> >> Sigma = 0.5*(sd(x)+sd(y)) >> >> >> My new question is: >> >> How can set smooting parameter in stats.kde.gaussian_kde function? is it >> possible? > >In a recent scipy (since 0.10 IIRC) you can directly set the bandwidth >without subclassing > >http://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.gaussian_kde. html#scipy.stats.gaussian_kde >http://docs.scipy.org/doc/scipy/reference/tutorial/stats.html#kernel-density- estimation > >Josef > >> >> Thanks >> >> Francesco >> >> >>>----Messaggio originale---- >>>Da: jsseabold at gmail.com >>>Data: 19/01/2013 15.21 >>>A: "francescoboccacci at libero.it", "SciPy Users >> List" >>>Ogg: Re: [SciPy-User] R: Re: Epanechnikov kernel >>> >>>On Sat, Jan 19, 2013 at 8:48 AM, francescoboccacci at libero.it >>> wrote: >>>> Hi, >>>> is there a possibility to multivariate KDE using Epanechnikov kernel? my >>>> variables are X Y (point position) >>>> >>> >>>As Josef mentioned there is no way for the user to choose the kernel >>>at present. The functionality is there, but it needs to be hooked in >>>with a suitable API. I didn't keep up with these discussions, so I >>>don't know the current status. 
If it's something you're interested in >>>trying to help with, I'm sure people would be appreciative and you can >>>ping the statsmodels mailing list. >>> >>>Practically though, the reason this hasn't been done yet is that the >>>choice of the kernel is not all that important. Bandwidth selection is >>>the most important variable and other kernels perform similarly given >>>a good bandwidth. Is there any particular reason you want Epanechnikov >>>kernel in particular? >>> >>>Skipper >>> >>>> Thanks >>>> >>>> Francesco >>>> >>>>>----Messaggio originale---- >>>>>Da: jsseabold at gmail.com >>>>>Data: 19/01/2013 14.32 >>>>>A: "SciPy Users List" >>>>>Ogg: Re: [SciPy-User] Epanechnikov kernel >>>>> >>>>>On Sat, Jan 19, 2013 at 7:49 AM, wrote: >>>>>> On Sat, Jan 19, 2013 at 6:34 AM, francescoboccacci at libero.it >>>>>> wrote: >>>>>>> Hi all, >>>>>>> >>>>>>> I have a question for you. Is it possible in scipy using a Epanechnikov >>>>>>> kernel function? >>>>>>> >>>>>>> I checked on scipy documentation but i found that the only way to >>>> calculate >>>>>>> kernel-density estimate is possible only with using Gaussian kernels? >>>>>>> >>>>>>> Is it true? >>>>>> >>>>>> Yes, kde in scipy.stats only has gaussian_kde >>>>>> >>>>>> Also in statsmodels currently only gaussian is supported for >>>>>> continuous data >>>>>> http://statsmodels.sourceforge.net/devel/nonparametric.html >>>>>> (It was removed because in the references only the bandwidth selection >>>>>> made much difference in the estimation, but not the shape of the >>>>>> kernel. Other kernels for continuous variables will come back >>>>>> eventually. >>>>> >>>>>If you're interested in univariate KDE, then we do have the Epanechnikov >>>> kernel. >>>>> >>>>>http://statsmodels.sourceforge.net/devel/generated/statsmodels. >> nonparametric. >>>> kde.KDEUnivariate.fit.html#statsmodels.nonparametric.kde.KDEUnivariate. fit >>>>> >>>>>Skipper >>>>>_______________________________________________ >>>>>SciPy-User mailing list >>>>>SciPy-User at scipy.org >>>>>http://mail.scipy.org/mailman/listinfo/scipy-user >>>>> >>>> >>>> >>>> _______________________________________________ >>>> SciPy-User mailing list >>>> SciPy-User at scipy.org >>>> http://mail.scipy.org/mailman/listinfo/scipy-user >>> >> >> >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user > _______________________________________________ SciPy-User mailing list SciPy-User at scipy.org http://mail.scipy.org/mailman/listinfo/scipy-user -------------- next part -------------- An HTML attachment was scrubbed... URL: From kelson924 at aol.com Sat Jan 19 04:20:56 2013 From: kelson924 at aol.com (Kelson Zawack) Date: Sat, 19 Jan 2013 04:20:56 -0500 Subject: [SciPy-User] Hierarchical Clustering Message-ID: <50FA6578.3020401@aol.com> I have a matrix of n observations of length m I would like to cluster. The documentation for scipy.cluster.hierarchy.linkage says it takes 'A condensed or redundant distance matrix... Alternatively, a collection of m observation vectors in n dimensions may be passed as an m by n array.' I tried passing in the condensed matrix returned from scipy.spatial.distance.pdist, the matrix returned from calling scipy.spatial.distance.squarefrom on the previous matrix, and the raw data matrix along with single link as the method and euclidean as the distance measure and got 3 different answers. 
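Those three calls are indeed not equivalent; a sketch of what each one does, based on linkage's rule that a 1-D input is read as condensed distances while any 2-D input is read as raw observation vectors:

import numpy as np
from scipy.spatial.distance import pdist, squareform
from scipy.cluster.hierarchy import linkage

X = np.random.rand(10, 4)                            # n observations of length m

d = pdist(X, metric='euclidean')                     # condensed (1-D) distances

Z1 = linkage(d, method='single')                     # correct: condensed distances
Z2 = linkage(X, method='single', metric='euclidean') # also correct: pdist is called internally
print(np.allclose(Z1, Z2))                           # should be True

# The square matrix from squareform is 2-D, so it is treated as a set of
# observation vectors: the rows of the distance matrix themselves get
# clustered, which is why it produces a different, wrong answer.
Z3 = linkage(squareform(d), method='single')

With only a few well-separated toy points (as tried next) all three variants can happen to produce the same tree, which hides the mix-up.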
I then tried it with toy data like observations of [[0,0,0,], [1,1,1], [5,5,5], [6,6,6]] and they all give the same answer. This makes me very nervous. What is the correct way the call the function and how can I be sure of this? Thanks -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: f5047d1e0cbb50ec208923a22cd517c55100fa7b.png Type: image/png Size: 216 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: 174fadd07fd54c9afe288e96558c92e0c1da733a.png Type: image/png Size: 202 bytes Desc: not available URL: From patrick.marsh at noaa.gov Sat Jan 19 12:40:26 2013 From: patrick.marsh at noaa.gov (Patrick Marsh) Date: Sat, 19 Jan 2013 11:40:26 -0600 Subject: [SciPy-User] R: Re: R: Re: R: Re: Epanechnikov kernel In-Reply-To: <29952172.3089071358608681087.JavaMail.defaultUser@defaultHost> References: <29952172.3089071358608681087.JavaMail.defaultUser@defaultHost> Message-ID: I've previously coded up a Cython version of the Epanechnikov kernel. You can find the function here: https://gist.github.com/4573808 It's certainly not optimized. It was a quick hack for use with rare (spatial) meteorological events. As the grid density increases, the performance decreases significantly. At this point, your best bet would be to create a grid that has the weights of the Epanechnikov kernel, and do a FFT convolve between the two grids. A pseudocode example (that I believe should work) is shown below... ============================================ import numpy as np import scipy as sp import epanechnikov (from the gist linked to above) data_to_kde = ... # Your 2D array # Create a grid with a value of 1 at the midpoint raw_epan_grid = np.zeros((51, 51), dtype=np.float64) raw_epan_gird[25, 25] = 1 # Convert this binary grid into the weights of the Epanechnikov kernel bandwidth = 10 dx = 1 epan_kernel = epanechnikov(raw_epan_grid, bandwidth, dx) # Use FFTCONVOLVE to do the smoothing in Fourier space data_smoothed = sp.signal.fftconvolve(data_to_kde, epan_kernel, mode='same') ============================================ This is slower than the function linked above for sparse grids, but faster for dense grids. (The runtime of fftconvolve is dependent upon the size of your arrays, not the density.) Hope this helps Patrick --- Patrick Marsh Ph.D. Candidate / Liaison to the HWT School of Meteorology / University of Oklahoma Cooperative Institute for Mesoscale Meteorological Studies National Severe Storms Laboratory http://www.patricktmarsh.com On Sat, Jan 19, 2013 at 9:18 AM, francescoboccacci at libero.it < francescoboccacci at libero.it> wrote: > Thanks Josef, i will investigate on it. > I'm using scipy version '0.9.0' so i need to update it. > If i have some problems i will ask you again :). > Thanks for your time > > Francesco > > >----Messaggio originale---- > >Da: josef.pktd at gmail.com > >Data: 19/01/2013 16.06 > >A: "francescoboccacci at libero.it", "SciPy > Users > List" > >Ogg: Re: [SciPy-User] R: Re: R: Re: Epanechnikov kernel > > > >On Sat, Jan 19, 2013 at 9:57 AM, francescoboccacci at libero.it > > wrote: > >> Hi, > >> i would like to use a Epanechnikov kernel because i would like > replicate > an R > >> function that use Epanechnikov kernel. 
> >> Reading in depth a documentation below documentation: > >> > >> > >> http://rgm3.lab.nig.ac.jp/RGM/r_function?p=adehabitatHR&f=kernelUD > >> > >> i found that i can use normal kernel (i think guaussion kernel). > >> Below i write a pieces of my code: > >> > >> > >> xmin = min(xPoints) > >> xmax = max(xPoints) > >> ymin = min(yPoints) > >> ymax = max(yPoints) > >> X,Y = np.mgrid[xmin:xmax:40j, ymin:ymax:40j] > >> positions = np.vstack([X.ravel(), Y.ravel()]) > >> values = np.vstack([xPoints,yPoints]) > >> # scipy.stats.kde.gaussian_kde -- > >> # Representation of a kernel-density estimate using > Gaussian > >> kernels. > >> kernel = stats.kde.gaussian_kde(values) > >> > >> Z = np.reshape(kernel(positions).T, X.T.shape) > >> > >> If i understood in right way the missing part that i have to implement > is > the > >> smoothing paramter h: > >> > >> h = Sigma*n^(-1/6) > >> > >> where > >> > >> Sigma = 0.5*(sd(x)+sd(y)) > >> > >> > >> My new question is: > >> > >> How can set smooting parameter in stats.kde.gaussian_kde function? is it > >> possible? > > > >In a recent scipy (since 0.10 IIRC) you can directly set the bandwidth > >without subclassing > > > > > http://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.gaussian_kde. > html#scipy.stats.gaussian_kde > > > http://docs.scipy.org/doc/scipy/reference/tutorial/stats.html#kernel-density- > estimation > > > >Josef > > > >> > >> Thanks > >> > >> Francesco > >> > >> > >>>----Messaggio originale---- > >>>Da: jsseabold at gmail.com > >>>Data: 19/01/2013 15.21 > >>>A: "francescoboccacci at libero.it", "SciPy > Users > >> List" > >>>Ogg: Re: [SciPy-User] R: Re: Epanechnikov kernel > >>> > >>>On Sat, Jan 19, 2013 at 8:48 AM, francescoboccacci at libero.it > >>> wrote: > >>>> Hi, > >>>> is there a possibility to multivariate KDE using Epanechnikov > kernel? my > >>>> variables are X Y (point position) > >>>> > >>> > >>>As Josef mentioned there is no way for the user to choose the kernel > >>>at present. The functionality is there, but it needs to be hooked in > >>>with a suitable API. I didn't keep up with these discussions, so I > >>>don't know the current status. If it's something you're interested in > >>>trying to help with, I'm sure people would be appreciative and you can > >>>ping the statsmodels mailing list. > >>> > >>>Practically though, the reason this hasn't been done yet is that the > >>>choice of the kernel is not all that important. Bandwidth selection is > >>>the most important variable and other kernels perform similarly given > >>>a good bandwidth. Is there any particular reason you want Epanechnikov > >>>kernel in particular? > >>> > >>>Skipper > >>> > >>>> Thanks > >>>> > >>>> Francesco > >>>> > >>>>>----Messaggio originale---- > >>>>>Da: jsseabold at gmail.com > >>>>>Data: 19/01/2013 14.32 > >>>>>A: "SciPy Users List" > >>>>>Ogg: Re: [SciPy-User] Epanechnikov kernel > >>>>> > >>>>>On Sat, Jan 19, 2013 at 7:49 AM, wrote: > >>>>>> On Sat, Jan 19, 2013 at 6:34 AM, francescoboccacci at libero.it > >>>>>> wrote: > >>>>>>> Hi all, > >>>>>>> > >>>>>>> I have a question for you. Is it possible in scipy using a > Epanechnikov > >>>>>>> kernel function? > >>>>>>> > >>>>>>> I checked on scipy documentation but i found that the only way to > >>>> calculate > >>>>>>> kernel-density estimate is possible only with using Gaussian > kernels? > >>>>>>> > >>>>>>> Is it true? 
> >>>>>> > >>>>>> Yes, kde in scipy.stats only has gaussian_kde > >>>>>> > >>>>>> Also in statsmodels currently only gaussian is supported for > >>>>>> continuous data > >>>>>> http://statsmodels.sourceforge.net/devel/nonparametric.html > >>>>>> (It was removed because in the references only the bandwidth > selection > >>>>>> made much difference in the estimation, but not the shape of the > >>>>>> kernel. Other kernels for continuous variables will come back > >>>>>> eventually. > >>>>> > >>>>>If you're interested in univariate KDE, then we do have the > Epanechnikov > >>>> kernel. > >>>>> > >>>>>http://statsmodels.sourceforge.net/devel/generated/statsmodels. > >> nonparametric. > >>>> > kde.KDEUnivariate.fit.html#statsmodels.nonparametric.kde.KDEUnivariate. > fit > >>>>> > >>>>>Skipper > >>>>>_______________________________________________ > >>>>>SciPy-User mailing list > >>>>>SciPy-User at scipy.org > >>>>>http://mail.scipy.org/mailman/listinfo/scipy-user > >>>>> > >>>> > >>>> > >>>> _______________________________________________ > >>>> SciPy-User mailing list > >>>> SciPy-User at scipy.org > >>>> http://mail.scipy.org/mailman/listinfo/scipy-user > >>> > >> > >> > >> _______________________________________________ > >> SciPy-User mailing list > >> SciPy-User at scipy.org > >> http://mail.scipy.org/mailman/listinfo/scipy-user > > > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jdt.bruck at gmail.com Sat Jan 19 22:52:45 2013 From: jdt.bruck at gmail.com (Jonathan Bruck) Date: Sun, 20 Jan 2013 14:52:45 +1100 Subject: [SciPy-User] ImportError for clapack and flapack Message-ID: Hi Scipy Users, I have installed numpy and scipy into a virtualenv using pip but I have import errors in the linalg libraries that prevent me from continuing to make SimpleCV work. running numpy.test() prints out no big messages like this. running scipy.test() prints out the following. In particular it is the import errors that are causing me problems. I am using Ubuntu 12.10 64-bit on my laptop, and I prepared for this installation by using "sudo apt-get build-dep numpy scipy" (along with plenty of other installs). 
Thanks in advance Regards, Jonathan ====================================================================== ERROR: Failure: ImportError (/home/jonathan/.virtualenvs/py27/local/lib/python2.7/site-packages/scipy/linalg/clapack.so: undefined symbol: clapack_sgesv) ---------------------------------------------------------------------- Traceback (most recent call last): File "/home/jonathan/.virtualenvs/py27/local/lib/python2.7/site-packages/nose/loader.py", line 390, in loadTestsFromName addr.filename, addr.module) File "/home/jonathan/.virtualenvs/py27/local/lib/python2.7/site-packages/nose/importer.py", line 39, in importFromPath return self.importFromDir(dir_path, fqname) File "/home/jonathan/.virtualenvs/py27/local/lib/python2.7/site-packages/nose/importer.py", line 86, in importFromDir mod = load_module(part_fqname, fh, filename, desc) File "/home/jonathan/.virtualenvs/py27/local/lib/python2.7/site-packages/scipy/interpolate/__init__.py", line 154, in from rbf import Rbf File "/home/jonathan/.virtualenvs/py27/local/lib/python2.7/site-packages/scipy/interpolate/rbf.py", line 49, in from scipy import linalg File "/home/jonathan/.virtualenvs/py27/local/lib/python2.7/site-packages/scipy/linalg/__init__.py", line 133, in from basic import * File "/home/jonathan/.virtualenvs/py27/local/lib/python2.7/site-packages/scipy/linalg/basic.py", line 12, in from lapack import get_lapack_funcs File "/home/jonathan/.virtualenvs/py27/local/lib/python2.7/site-packages/scipy/linalg/lapack.py", line 15, in from scipy.linalg import clapack ImportError: /home/jonathan/.virtualenvs/py27/local/lib/python2.7/site-packages/scipy/linalg/clapack.so: undefined symbol: clapack_sgesv ====================================================================== ERROR: Failure: ImportError (/home/jonathan/.virtualenvs/py27/local/lib/python2.7/site-packages/scipy/lib/lapack/clapack.so: undefined symbol: clapack_sgesv) ---------------------------------------------------------------------- Traceback (most recent call last): File "/home/jonathan/.virtualenvs/py27/local/lib/python2.7/site-packages/nose/loader.py", line 390, in loadTestsFromName addr.filename, addr.module) File "/home/jonathan/.virtualenvs/py27/local/lib/python2.7/site-packages/nose/importer.py", line 39, in importFromPath return self.importFromDir(dir_path, fqname) File "/home/jonathan/.virtualenvs/py27/local/lib/python2.7/site-packages/nose/importer.py", line 86, in importFromDir mod = load_module(part_fqname, fh, filename, desc) File "/home/jonathan/.virtualenvs/py27/local/lib/python2.7/site-packages/scipy/lib/lapack/__init__.py", line 148, in import clapack ImportError: /home/jonathan/.virtualenvs/py27/local/lib/python2.7/site-packages/scipy/lib/lapack/clapack.so: undefined symbol: clapack_sgesv ====================================================================== ERROR: Failure: ImportError (cannot import name flapack) ---------------------------------------------------------------------- Traceback (most recent call last): File "/home/jonathan/.virtualenvs/py27/local/lib/python2.7/site-packages/nose/loader.py", line 390, in loadTestsFromName addr.filename, addr.module) File "/home/jonathan/.virtualenvs/py27/local/lib/python2.7/site-packages/nose/importer.py", line 39, in importFromPath return self.importFromDir(dir_path, fqname) File "/home/jonathan/.virtualenvs/py27/local/lib/python2.7/site-packages/nose/importer.py", line 86, in importFromDir mod = load_module(part_fqname, fh, filename, desc) File 
"/home/jonathan/.virtualenvs/py27/local/lib/python2.7/site-packages/scipy/linalg/__init__.py", line 133, in from basic import * File "/home/jonathan/.virtualenvs/py27/local/lib/python2.7/site-packages/scipy/linalg/basic.py", line 12, in from lapack import get_lapack_funcs File "/home/jonathan/.virtualenvs/py27/local/lib/python2.7/site-packages/scipy/linalg/lapack.py", line 14, in from scipy.linalg import flapack ImportError: cannot import name flapack ====================================================================== ERROR: test_common.test_pade_trivial ---------------------------------------------------------------------- Traceback (most recent call last): File "/home/jonathan/.virtualenvs/py27/local/lib/python2.7/site-packages/nose/case.py", line 197, in runTest self.test(*self.arg) File "/home/jonathan/.virtualenvs/py27/local/lib/python2.7/site-packages/scipy/misc/tests/test_common.py", line 9, in test_pade_trivial nump, denomp = pade([1.0], 0) File "/home/jonathan/.virtualenvs/py27/local/lib/python2.7/site-packages/scipy/misc/common.py", line 371, in pade from scipy import linalg File "/home/jonathan/.virtualenvs/py27/local/lib/python2.7/site-packages/scipy/linalg/__init__.py", line 133, in from basic import * File "/home/jonathan/.virtualenvs/py27/local/lib/python2.7/site-packages/scipy/linalg/basic.py", line 12, in from lapack import get_lapack_funcs File "/home/jonathan/.virtualenvs/py27/local/lib/python2.7/site-packages/scipy/linalg/lapack.py", line 14, in from scipy.linalg import flapack ImportError: cannot import name flapack ====================================================================== ERROR: test_common.test_pade_4term_exp ---------------------------------------------------------------------- Traceback (most recent call last): File "/home/jonathan/.virtualenvs/py27/local/lib/python2.7/site-packages/nose/case.py", line 197, in runTest self.test(*self.arg) File "/home/jonathan/.virtualenvs/py27/local/lib/python2.7/site-packages/scipy/misc/tests/test_common.py", line 18, in test_pade_4term_exp nump, denomp = pade(an, 0) File "/home/jonathan/.virtualenvs/py27/local/lib/python2.7/site-packages/scipy/misc/common.py", line 371, in pade from scipy import linalg File "/home/jonathan/.virtualenvs/py27/local/lib/python2.7/site-packages/scipy/linalg/__init__.py", line 133, in from basic import * File "/home/jonathan/.virtualenvs/py27/local/lib/python2.7/site-packages/scipy/linalg/basic.py", line 12, in from lapack import get_lapack_funcs File "/home/jonathan/.virtualenvs/py27/local/lib/python2.7/site-packages/scipy/linalg/lapack.py", line 14, in from scipy.linalg import flapack ImportError: cannot import name flapack ====================================================================== ERROR: Failure: ImportError (cannot import name flapack) ---------------------------------------------------------------------- Traceback (most recent call last): File "/home/jonathan/.virtualenvs/py27/local/lib/python2.7/site-packages/nose/loader.py", line 390, in loadTestsFromName addr.filename, addr.module) File "/home/jonathan/.virtualenvs/py27/local/lib/python2.7/site-packages/nose/importer.py", line 39, in importFromPath return self.importFromDir(dir_path, fqname) File "/home/jonathan/.virtualenvs/py27/local/lib/python2.7/site-packages/nose/importer.py", line 86, in importFromDir mod = load_module(part_fqname, fh, filename, desc) File "/home/jonathan/.virtualenvs/py27/local/lib/python2.7/site-packages/scipy/optimize/__init__.py", line 146, in from _root import * File 
"/home/jonathan/.virtualenvs/py27/local/lib/python2.7/site-packages/scipy/optimize/_root.py", line 17, in import nonlin File "/home/jonathan/.virtualenvs/py27/local/lib/python2.7/site-packages/scipy/optimize/nonlin.py", line 116, in from scipy.linalg import norm, solve, inv, qr, svd, lstsq, LinAlgError File "/home/jonathan/.virtualenvs/py27/local/lib/python2.7/site-packages/scipy/linalg/__init__.py", line 133, in from basic import * File "/home/jonathan/.virtualenvs/py27/local/lib/python2.7/site-packages/scipy/linalg/basic.py", line 12, in from lapack import get_lapack_funcs File "/home/jonathan/.virtualenvs/py27/local/lib/python2.7/site-packages/scipy/linalg/lapack.py", line 14, in from scipy.linalg import flapack ImportError: cannot import name flapack ====================================================================== ERROR: Failure: ImportError (cannot import name flapack) ---------------------------------------------------------------------- Traceback (most recent call last): File "/home/jonathan/.virtualenvs/py27/local/lib/python2.7/site-packages/nose/loader.py", line 390, in loadTestsFromName addr.filename, addr.module) File "/home/jonathan/.virtualenvs/py27/local/lib/python2.7/site-packages/nose/importer.py", line 39, in importFromPath return self.importFromDir(dir_path, fqname) File "/home/jonathan/.virtualenvs/py27/local/lib/python2.7/site-packages/nose/importer.py", line 86, in importFromDir mod = load_module(part_fqname, fh, filename, desc) File "/home/jonathan/.virtualenvs/py27/local/lib/python2.7/site-packages/scipy/signal/__init__.py", line 218, in from cont2discrete import * File "/home/jonathan/.virtualenvs/py27/local/lib/python2.7/site-packages/scipy/signal/cont2discrete.py", line 9, in from scipy import linalg File "/home/jonathan/.virtualenvs/py27/local/lib/python2.7/site-packages/scipy/linalg/__init__.py", line 133, in from basic import * File "/home/jonathan/.virtualenvs/py27/local/lib/python2.7/site-packages/scipy/linalg/basic.py", line 12, in from lapack import get_lapack_funcs File "/home/jonathan/.virtualenvs/py27/local/lib/python2.7/site-packages/scipy/linalg/lapack.py", line 14, in from scipy.linalg import flapack ImportError: cannot import name flapack ====================================================================== ERROR: Failure: ImportError (cannot import name flapack) ---------------------------------------------------------------------- Traceback (most recent call last): File "/home/jonathan/.virtualenvs/py27/local/lib/python2.7/site-packages/nose/loader.py", line 390, in loadTestsFromName addr.filename, addr.module) File "/home/jonathan/.virtualenvs/py27/local/lib/python2.7/site-packages/nose/importer.py", line 39, in importFromPath return self.importFromDir(dir_path, fqname) File "/home/jonathan/.virtualenvs/py27/local/lib/python2.7/site-packages/nose/importer.py", line 86, in importFromDir mod = load_module(part_fqname, fh, filename, desc) File "/home/jonathan/.virtualenvs/py27/local/lib/python2.7/site-packages/scipy/sparse/linalg/__init__.py", line 90, in from isolve import * File "/home/jonathan/.virtualenvs/py27/local/lib/python2.7/site-packages/scipy/sparse/linalg/isolve/__init__.py", line 6, in from lgmres import lgmres File "/home/jonathan/.virtualenvs/py27/local/lib/python2.7/site-packages/scipy/sparse/linalg/isolve/lgmres.py", line 5, in from scipy.linalg import get_blas_funcs File "/home/jonathan/.virtualenvs/py27/local/lib/python2.7/site-packages/scipy/linalg/__init__.py", line 133, in from basic import * File 
"/home/jonathan/.virtualenvs/py27/local/lib/python2.7/site-packages/scipy/linalg/basic.py", line 12, in from lapack import get_lapack_funcs File "/home/jonathan/.virtualenvs/py27/local/lib/python2.7/site-packages/scipy/linalg/lapack.py", line 14, in from scipy.linalg import flapack ImportError: cannot import name flapack ====================================================================== ERROR: Failure: ImportError (cannot import name flapack) ---------------------------------------------------------------------- Traceback (most recent call last): File "/home/jonathan/.virtualenvs/py27/local/lib/python2.7/site-packages/nose/loader.py", line 390, in loadTestsFromName addr.filename, addr.module) File "/home/jonathan/.virtualenvs/py27/local/lib/python2.7/site-packages/nose/importer.py", line 39, in importFromPath return self.importFromDir(dir_path, fqname) File "/home/jonathan/.virtualenvs/py27/local/lib/python2.7/site-packages/nose/importer.py", line 86, in importFromDir mod = load_module(part_fqname, fh, filename, desc) File "/home/jonathan/.virtualenvs/py27/local/lib/python2.7/site-packages/scipy/sparse/tests/test_base.py", line 34, in from scipy.sparse.linalg import splu File "/home/jonathan/.virtualenvs/py27/local/lib/python2.7/site-packages/scipy/sparse/linalg/__init__.py", line 90, in from isolve import * File "/home/jonathan/.virtualenvs/py27/local/lib/python2.7/site-packages/scipy/sparse/linalg/isolve/__init__.py", line 6, in from lgmres import lgmres File "/home/jonathan/.virtualenvs/py27/local/lib/python2.7/site-packages/scipy/sparse/linalg/isolve/lgmres.py", line 5, in from scipy.linalg import get_blas_funcs File "/home/jonathan/.virtualenvs/py27/local/lib/python2.7/site-packages/scipy/linalg/__init__.py", line 133, in from basic import * File "/home/jonathan/.virtualenvs/py27/local/lib/python2.7/site-packages/scipy/linalg/basic.py", line 12, in from lapack import get_lapack_funcs File "/home/jonathan/.virtualenvs/py27/local/lib/python2.7/site-packages/scipy/linalg/lapack.py", line 14, in from scipy.linalg import flapack ImportError: cannot import name flapack ====================================================================== ERROR: Failure: ImportError (cannot import name flapack) ---------------------------------------------------------------------- Traceback (most recent call last): File "/home/jonathan/.virtualenvs/py27/local/lib/python2.7/site-packages/nose/loader.py", line 390, in loadTestsFromName addr.filename, addr.module) File "/home/jonathan/.virtualenvs/py27/local/lib/python2.7/site-packages/nose/importer.py", line 39, in importFromPath return self.importFromDir(dir_path, fqname) File "/home/jonathan/.virtualenvs/py27/local/lib/python2.7/site-packages/nose/importer.py", line 86, in importFromDir mod = load_module(part_fqname, fh, filename, desc) File "/home/jonathan/.virtualenvs/py27/local/lib/python2.7/site-packages/scipy/stats/__init__.py", line 321, in from stats import * File "/home/jonathan/.virtualenvs/py27/local/lib/python2.7/site-packages/scipy/stats/stats.py", line 194, in import scipy.linalg as linalg File "/home/jonathan/.virtualenvs/py27/local/lib/python2.7/site-packages/scipy/linalg/__init__.py", line 133, in from basic import * File "/home/jonathan/.virtualenvs/py27/local/lib/python2.7/site-packages/scipy/linalg/basic.py", line 12, in from lapack import get_lapack_funcs File "/home/jonathan/.virtualenvs/py27/local/lib/python2.7/site-packages/scipy/linalg/lapack.py", line 14, in from scipy.linalg import flapack ImportError: cannot import name 
flapack ====================================================================== FAIL: Test generator for parametric tests ---------------------------------------------------------------------- Traceback (most recent call last): File "/home/jonathan/.virtualenvs/py27/local/lib/python2.7/site-packages/nose/case.py", line 197, in runTest self.test(*self.arg) File "/home/jonathan/.virtualenvs/py27/local/lib/python2.7/site-packages/scipy/misc/tests/test_pilutil.py", line 52, in tst_fromimage assert_(img.min() >= imin) File "/home/jonathan/.virtualenvs/py27/local/lib/python2.7/site-packages/numpy/testing/utils.py", line 34, in assert_ raise AssertionError(msg) AssertionError ====================================================================== FAIL: Test generator for parametric tests ---------------------------------------------------------------------- Traceback (most recent call last): File "/home/jonathan/.virtualenvs/py27/local/lib/python2.7/site-packages/nose/case.py", line 197, in runTest self.test(*self.arg) File "/home/jonathan/.virtualenvs/py27/local/lib/python2.7/site-packages/scipy/misc/tests/test_pilutil.py", line 52, in tst_fromimage assert_(img.min() >= imin) File "/home/jonathan/.virtualenvs/py27/local/lib/python2.7/site-packages/numpy/testing/utils.py", line 34, in assert_ raise AssertionError(msg) AssertionError ====================================================================== FAIL: Test generator for parametric tests ---------------------------------------------------------------------- Traceback (most recent call last): File "/home/jonathan/.virtualenvs/py27/local/lib/python2.7/site-packages/nose/case.py", line 197, in runTest self.test(*self.arg) File "/home/jonathan/.virtualenvs/py27/local/lib/python2.7/site-packages/scipy/misc/tests/test_pilutil.py", line 52, in tst_fromimage assert_(img.min() >= imin) File "/home/jonathan/.virtualenvs/py27/local/lib/python2.7/site-packages/numpy/testing/utils.py", line 34, in assert_ raise AssertionError(msg) AssertionError ====================================================================== FAIL: test_io.test_imread ---------------------------------------------------------------------- Traceback (most recent call last): File "/home/jonathan/.virtualenvs/py27/local/lib/python2.7/site-packages/nose/case.py", line 197, in runTest self.test(*self.arg) File "/home/jonathan/.virtualenvs/py27/local/lib/python2.7/site-packages/numpy/testing/decorators.py", line 146, in skipper_func return f(*args, **kwargs) File "/home/jonathan/.virtualenvs/py27/local/lib/python2.7/site-packages/scipy/ndimage/tests/test_io.py", line 16, in test_imread assert_array_equal(img.shape, (300, 420, 3)) File "/home/jonathan/.virtualenvs/py27/local/lib/python2.7/site-packages/numpy/testing/utils.py", line 707, in assert_array_equal verbose=verbose, header='Arrays are not equal') File "/home/jonathan/.virtualenvs/py27/local/lib/python2.7/site-packages/numpy/testing/utils.py", line 600, in assert_array_compare raise AssertionError(msg) AssertionError: Arrays are not equal (shapes (0,), (3,) mismatch) x: array([], dtype=float64) y: array([300, 420, 3]) ---------------------------------------------------------------------- Ran 2446 tests in 24.397s FAILED (KNOWNFAIL=7, SKIP=7, errors=10, failures=4) Out[2]: -- There are no passengers on Spaceship Earth. We are all crew. 
Jonathan Bruck E: jdt.bruck at gmail.com Mob: 0421188951 Bachelor of Engineering (Mechanical, Biomedical), Bachelor of Medical Science -- -------------- next part -------------- An HTML attachment was scrubbed... URL: From amueller at ais.uni-bonn.de Mon Jan 21 18:02:24 2013 From: amueller at ais.uni-bonn.de (Andreas Mueller) Date: Tue, 22 Jan 2013 00:02:24 +0100 Subject: [SciPy-User] ANN: scikit-learn 0.13 released! Message-ID: <50FDC900.9000600@ais.uni-bonn.de> Hi all. I am very happy to announce the release of scikit-learn 0.13. New features in this release include feature hashing for text processing, passive-agressive classifiers, faster random forests and many more. There have also been countless improvements in stability, consistency and usability. Details can be found on the what's new page. Sources and windows binaries are available on sourceforge, through pypi (http://pypi.python.org/pypi/scikit-learn/0.13) or can be installed directly using pip: pip install -U scikit-learn A big "thank you" to all the contributors who made this release possible! In parallel to the release, we started a small survey to get to know our user base a bit more. If you are using scikit-learn, it would be great if you could give us your input. Best, Andy -------------- next part -------------- An HTML attachment was scrubbed... URL: From olivier.grisel at ensta.org Mon Jan 21 18:19:58 2013 From: olivier.grisel at ensta.org (Olivier Grisel) Date: Tue, 22 Jan 2013 00:19:58 +0100 Subject: [SciPy-User] [Numpy-discussion] ANN: scikit-learn 0.13 released! In-Reply-To: <50FDC900.9000600@ais.uni-bonn.de> References: <50FDC900.9000600@ais.uni-bonn.de> Message-ID: Congrats and thanks to Andreas and everyone involved in the release, the website fixes and the online survey setup. I posted Andreas blog post on HN and reddit: - http://news.ycombinator.com/item?id=5094319 - http://www.reddit.com/r/programming/comments/170oty/scikitlearn_013_is_out_machine_learning_in_python/ We might get some user feedback in the comments there as well. From Sam.Cable at kirtland.af.mil Tue Jan 22 11:39:23 2013 From: Sam.Cable at kirtland.af.mil (Cable, Sam B Civ USAF AFMC AFRL/RVBXI) Date: Tue, 22 Jan 2013 09:39:23 -0700 Subject: [SciPy-User] scipy can't find a LAPACK routine Message-ID: <39B5ED61E7BFC24FA8277B6DE92A9A3F04FB8CA7@fkimlki01.enterprise.afmc.ds.af.mil> Hope I'm not making a pest out of myself by reposting this, guys, but I'm getting kind of desperate. Am trying to install scipy. It can't find an object that is supposed to be in LAPACK: ImportError: .../flapack.so : undefined symbol: sgges_ I have followed the FAQ instructions, building ATLAS with the option for the full LAPACK build. I have also found a pre-compiled LAPACK binary and just dropped it in place. Nothing changes the outcome. Running "nm" on liblapack.a shows that it contains sgges_, no doubt about it. And "system_info.py" finds liblapack.a in /usr/local/lib just fine. So I don't know what the problem is. Can anyone tell me if this is critical, and what else might be done? BTW, I am running python 2.7 on CentOS 5.x and my FORTRAN compiler is gfortran. Thanks. -------------- next part -------------- An HTML attachment was scrubbed... URL: From paolo.losi at gmail.com Mon Jan 21 23:51:14 2013 From: paolo.losi at gmail.com (Paolo Losi) Date: Tue, 22 Jan 2013 05:51:14 +0100 Subject: [SciPy-User] [Scikit-learn-general] ANN: scikit-learn 0.13 released! 
In-Reply-To: <50FDC900.9000600@ais.uni-bonn.de> References: <50FDC900.9000600@ais.uni-bonn.de> Message-ID: Great Work and thank you Andreas! Paolo On Tue, Jan 22, 2013 at 12:02 AM, Andreas Mueller wrote: > Hi all. > I am very happy to announce the release of scikit-learn 0.13. > New features in this release include feature hashing for text processing, > passive-aggressive classifiers, faster random forests and many more. > > There have also been countless improvements in stability, consistency and > usability. > > Details can be found on the what's new > page. > > Sources and windows binaries are available on sourceforge, > through pypi (http://pypi.python.org/pypi/scikit-learn/0.13) or > can be installed directly using pip: > > pip install -U scikit-learn > > A big "thank you" to all the contributors who made this release possible! > > In parallel to the release, we started a small survey to get to know our user base a bit more. > If you are using scikit-learn, it would be great if you could give us your > input. > > Best, > Andy > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at gmail.com Tue Jan 22 14:15:54 2013 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Tue, 22 Jan 2013 20:15:54 +0100 Subject: [SciPy-User] scipy can't find a LAPACK routine In-Reply-To: <39B5ED61E7BFC24FA8277B6DE92A9A3F04FB8CA7@fkimlki01.enterprise.afmc.ds.af.mil> References: <39B5ED61E7BFC24FA8277B6DE92A9A3F04FB8CA7@fkimlki01.enterprise.afmc.ds.af.mil> Message-ID: On Tue, Jan 22, 2013 at 5:39 PM, Cable, Sam B Civ USAF AFMC AFRL/RVBXI < Sam.Cable at kirtland.af.mil> wrote: > Hope I'm not making a pest out of myself by reposting this, guys, but I'm > getting kind of desperate. Am trying to install scipy. It can't find an > object that is supposed to be in LAPACK: > > ImportError: .../flapack.so : undefined symbol: sgges_ > > I have followed the FAQ instructions, building ATLAS with the option for > the full LAPACK build. I have also found a pre-compiled LAPACK binary and > just dropped it in place. Nothing changes the outcome. > > Running "nm" on liblapack.a shows that it contains sgges_, no doubt about > it. And "system_info.py" finds liblapack.a in /usr/local/lib just fine. > So I don't know what the problem is. > > Can anyone tell me if this is critical, and what else might be done? > Yes, that's a problem. You're getting an import error, meaning you won't be able to use at least part of the linear algebra functions. I don't know exactly what the problem is, most likely still related to your LAPACK being broken or the wrong liblapack.a being picked up at build time. If you can't fix it by building from source, I suggest either finding an RPM somewhere (if scipy is not shipped by CentOS, you could find a usable one at http://rpm.pbone.net/) or using a complete scientific Python distribution like EPD or Anaconda. Ralf > > BTW, I am running python 2.7 on CentOS 5.x and my FORTRAN compiler is > gfortran. > > Thanks. > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > -------------- next part -------------- An HTML attachment was scrubbed...
URL: From alex.eberspaecher at gmail.com Wed Jan 23 06:58:56 2013 From: alex.eberspaecher at gmail.com (Alexander Eberspächer) Date: Wed, 23 Jan 2013 12:58:56 +0100 Subject: [SciPy-User] scipy can't find a LAPACK routine In-Reply-To: <39B5ED61E7BFC24FA8277B6DE92A9A3F04FB8CA7@fkimlki01.enterprise.afmc.ds.af.mil> References: <39B5ED61E7BFC24FA8277B6DE92A9A3F04FB8CA7@fkimlki01.enterprise.afmc.ds.af.mil> Message-ID: <20130123125856.4646c443@poetzsch.nat.uni-magdeburg.de> On Tue, 22 Jan 2013 09:39:23 -0700 "Cable, Sam B Civ USAF AFMC AFRL/RVBXI" wrote: [LAPACK Symbols] > BTW, I am running python 2.7 on CentOS 5.x and my FORTRAN compiler is > gfortran. AFAIR, CentOS typically installs the ATLAS libraries to /usr/lib64/atlas/ instead of /usr/lib/ or /usr/lib64/. This path is most likely not in your LD_LIBRARY_PATH environment variable. Thus, on import, there are missing symbols. Please check that and try fiddling around with it. As a small rant: the same is the case with Scientific Linux (IIRC, you run into the same problems when you install an MPI implementation). This is certainly not user-friendly. I never had similar problems on Debian-based systems. Hope that helps Alex PS: As far as I understand, there is no point in using a pre-compiled ATLAS... From jrocher at enthought.com Wed Jan 23 15:16:06 2013 From: jrocher at enthought.com (Jonathan Rocher) Date: Wed, 23 Jan 2013 14:16:06 -0600 Subject: [SciPy-User] [SCIPY2013] Feedback on mini-symposia themes In-Reply-To: References: Message-ID: Dear community members, [Sorry for the cross-post] We are making progress and building an awesome organization team for the SciPy2013 conference (Scientific Computing with Python) this June 24th-29th in Austin, TX. More on that later. Following my previous email, we have gotten lots of good answers to our survey about the themes the community would like to see for the mini-symposia *[1]*. We will leave *this survey open until Feb 7th*. So if you haven't done so, and would like to discuss scientific python tools with peers from the same industry/field, take a second to voice your opinion: http://www.surveygizmo.com/s3/1114631/SciPy-2013-Themes Thanks, The SciPy2013 organizers *[1] These mini-symposia are held to discuss scientific computing applied to a specific scientific domain/industry during a half afternoon after the general conference. Their goal is to promote industry specific libraries and tools, and gather people with similar interests for discussions. For example, the SciPy2012 edition successfully hosted 4 mini-symposia on Astronomy/Astrophysics, Bio-informatics, Meteorology, and Geophysics.* On Wed, Jan 9, 2013 at 4:32 PM, Jonathan Rocher wrote: > Dear community members, > > We are working hard to organize the SciPy2013 conference (Scientific > Computing with Python) , > this June 24th-29th in Austin, TX. We would like to probe the community > about the themes you would be interested in contributing to or > participating in for the mini-symposia at SciPy2013. > > These mini-symposia are held to discuss scientific computing applied to a > specific *scientific domain/industry* during a half afternoon after the > general conference. Their goal is to promote industry specific libraries > and tools, and gather people with similar interests for discussions. For > example, the SciPy2012 edition > successfully hosted 4 mini-symposia on Astronomy/Astrophysics, > Bio-informatics, Meteorology, and Geophysics.
> > Please join us and voice your opinion to shape the next SciPy conference > at: > > http://www.surveygizmo.com/s3/1114631/SciPy-2013-Themes > > Thanks, > > The Scipy2013 organizers > > -- > Jonathan Rocher, PhD > Scientific software developer > Enthought, Inc. > jrocher at enthought.com > 1-512-536-1057 > http://www.enthought.com > > -- Jonathan Rocher, PhD Scientific software developer Enthought, Inc. jrocher at enthought.com 1-512-536-1057 http://www.enthought.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From alex.eberspaecher at gmail.com Wed Jan 23 16:42:40 2013 From: alex.eberspaecher at gmail.com (Alexander Eberspächer) Date: Wed, 23 Jan 2013 22:42:40 +0100 Subject: [SciPy-User] scipy can't find a LAPACK routine In-Reply-To: <20130123125856.4646c443@poetzsch.nat.uni-magdeburg.de> References: <39B5ED61E7BFC24FA8277B6DE92A9A3F04FB8CA7@fkimlki01.enterprise.afmc.ds.af.mil> <20130123125856.4646c443@poetzsch.nat.uni-magdeburg.de> Message-ID: <51005950.60107@gmail.com> On 01/23/2013 12:58 PM, Alexander Eberspächer wrote: > [LAPACK Symbols] >> BTW, I am running python 2.7 on CentOS 5.x and my FORTRAN compiler is >> gfortran. > > AFAIR, CentOS typically installs the ATLAS libraries > to /usr/lib64/atlas/ instead of /usr/lib/ or /usr/lib64/. This path is > most likely not in your LD_LIBRARY_PATH environment variable. Thus, on > import, there are missing symbols. Sorry for the sloppy spelling above :/ Here's how to append the suggested path to LD_LIBRARY_PATH using bash: export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/lib64/atlas/ Then, start a Python shell and try again. Greetings, Alex From lists at hilboll.de Thu Jan 24 11:59:17 2013 From: lists at hilboll.de (Andreas Hilboll) Date: Thu, 24 Jan 2013 17:59:17 +0100 Subject: [SciPy-User] scoreatpercentile behaviour Message-ID: <51016865.4050901@hilboll.de> I just had a quick look into scipy.stats.scoreatpercentile, and was disappointed to see that it's currently not possible to do the calculation for more than one percentile at a time (``per`` is scalar). So I had a quick look into the sources, and was surprised to see that apparently, the function expects ordered input ``a``, which is not noted in the docstring. (Or maybe it's just my misunderstanding of the word 'percentile'. I had expected the function to work on the input's **values**, not on the indices.) Is this a bug or a feature? If it's a feature, this should be very explicitly noted in the docstring, I think. I'm willing to do so if you can confirm that the current behaviour is actually wanted. In the sources' TODO, it's stated that a more general percentile implementation would be welcome. I might be able to contribute something here; any hints on where to start? Andreas. From josef.pktd at gmail.com Thu Jan 24 12:41:09 2013 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Thu, 24 Jan 2013 12:41:09 -0500 Subject: [SciPy-User] scoreatpercentile behaviour In-Reply-To: <51016865.4050901@hilboll.de> References: <51016865.4050901@hilboll.de> Message-ID: On Thu, Jan 24, 2013 at 11:59 AM, Andreas Hilboll wrote: > I just had a quick look into scipy.stats.scoreatpercentile, and was > disappointed to see that it's currently not possible to do the > calculation for more than one percentile at a time (``per`` is scalar). > So I had a quick look into the sources, and was surprised to see that > apparently, the function expects ordered input ``a``, which is not noted > in the docstring.
(Or maybe it's just my misunderstanding of the word > 'percentile'. I had expected the function to work on the input's > **values**, not on the indices. I'm not sure what you mean here: ``a`` is sorted by the function, and then we take the n*per smallest value (roughly, interpolates). That gives you the quantile value of the input array. > > Is this a bug or a feature? If it's a feature, this should be very > explicitly noted in the docstring, I think. I'm willing to do so if you > can confirm that the current behaviour is actually wanted. > > In the sources' TODO, it's stated that a more general percentile > implementation would be welcome. I might be able to contribute something > here; any hints on where to start? there is a pull request that follows the numpy implementation https://github.com/scipy/scipy/pull/374 stats.mstats has different options stats.mstats.scoreatpercentile and stats.mstats.mquantiles (I also wrote a draft for a fully vectorized version of it.) It's one of those function where I don't like the current implementation much, but don't know what the alternative should be. For example in statsmodels we also use stats.mstats.mquantiles because it has interpolation and an axis option. (So, I'm staying partially on the sidelines on this.) Josef > > Andreas. > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user From lists at hilboll.de Thu Jan 24 12:46:04 2013 From: lists at hilboll.de (Andreas Hilboll) Date: Thu, 24 Jan 2013 18:46:04 +0100 Subject: [SciPy-User] scoreatpercentile behaviour In-Reply-To: References: <51016865.4050901@hilboll.de> Message-ID: <5101735C.8040405@hilboll.de> Am 24.01.2013 18:41, schrieb josef.pktd at gmail.com: > On Thu, Jan 24, 2013 at 11:59 AM, Andreas Hilboll wrote: >> I just had a quick look into scipy.stats.scoreatpercentile, and was >> disappointed to see that it's currently not possible to do the >> calculation for more than one percentile at a time (``per`` is scalar). >> So I had a quick look into the sources, and was surprised to see that >> apprently, the function expects ordered input ``a``, which is not noted >> in the docstring. (Or maybe it's just my misunderstanding of the word >> 'percentile'. I had expected the function to work on the input's >> **values**, not on the indices. > > I'm not sure what you mean here: > ``a`` is sorted by the function, and then we take the n*per smallest > value (roughly, interpolates). > That gives you the quantile value of the input array. > >> >> Is this a bug or a feature? If it's a feature, this should be very >> explicitly noted in the docstring, I think. I'm willing to do so if you >> can confirm that the current behaviour is actually wanted. >> >> In the sources' TODO, it's stated that a more general percentile >> implementation would be welcome. I might be able to contribute something >> here; any hints on where to start? > > there is a pull request that follows the numpy implementation > https://github.com/scipy/scipy/pull/374 > > stats.mstats has different options > stats.mstats.scoreatpercentile and stats.mstats.mquantiles > > (I also wrote a draft for a fully vectorized version of it.) > > It's one of those function where I don't like the current > implementation much, but don't know what the alternative should be. > For example in statsmodels we also use stats.mstats.mquantiles because > it has interpolation and an axis option. > > (So, I'm staying partially on the sidelines on this.) 
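[For reference, a minimal usage sketch of the stats.mstats.mquantiles interface mentioned above; the data and probabilities here are made up for illustration and are not from the original thread.]

import numpy as np
from scipy.stats.mstats import mquantiles

a = np.random.randn(1000, 3)   # made-up data
# Several probabilities at once, computed along axis 0, with the
# default piecewise-linear interpolation between order statistics.
q = mquantiles(a, prob=[0.25, 0.5, 0.75], axis=0)
# q has shape (3, 3): one row per requested probability, one column per column of a.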
> > Josef Sorry for the noise, it turns out I was just too blind / tired / whatever to notice the very first line of the function values = np.sort(a, axis=0) My dumb fault. Thanks for making me realize. Andreas. From ncholy at gmail.com Thu Jan 24 17:14:53 2013 From: ncholy at gmail.com (Nick Choly) Date: Thu, 24 Jan 2013 17:14:53 -0500 Subject: [SciPy-User] indexing question Message-ID: I'm fairly new to this, so apologies in advance. Let's say I have a 3-d array A[i,j,k] of shape (Ni,Nj,Nk), and a list of labels I[k], where I[k] belongs to 0...Ni-1. I want to create a new, 2-d array B[j,k] = A[I[k], j, k]. I've found one solution to this, which is: ind_i = np.tile(I, (1,Nj)).T ind_j = np.tile(np.arange(Nj), (1,Nk)) ind_k = np.tile(np.arange(Nk), (1, Nj)).T B = A[ind_i, ind_j, ind_k] but something seems...not ideal about this. Can someone tell me if there's a simpler, better way? Thanks, Nick -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncholy at gmail.com Thu Jan 24 20:59:08 2013 From: ncholy at gmail.com (Nick C) Date: Thu, 24 Jan 2013 17:59:08 -0800 (PST) Subject: [SciPy-User] array indexing question Message-ID: I'm fairly new to this, so apologies in advance. Let's say I have a 3-d array A[i,j,k] of shape (Ni,Nj,Nk), and a list of labels I[k], where I[k] belongs to 0...Ni-1. I want to create a new, 2-d array B[j,k] = A[I[k], j, k]. I've found one solution to this, which is: ind_i = np.tile(I, (1,Nj)).T ind_j = np.tile(np.arange(Nj), (1,Nk)) ind_k = np.tile(np.arange(Nk), (1, Nj)).T B = A[ind_i, ind_j, ind_k] but something seems...not ideal about this. Can someone tell me if there's a simpler, better way? Thanks, Nick -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncholy at gmail.com Thu Jan 24 21:13:15 2013 From: ncholy at gmail.com (Nick C) Date: Thu, 24 Jan 2013 18:13:15 -0800 (PST) Subject: [SciPy-User] array indexing question In-Reply-To: References: Message-ID: <15833e14-1cd3-4466-8c61-483a1375b916@googlegroups.com> Ok, I think I found an improvement, myself; actually, it is a big improvement syntactically, and is perhaps more efficient: B = A[I[np.newaxis, :], np.arange(Nj)[:, np.newaxis], np.arange(Nk)[np.newaxis, :]] but is this the "right" way to do things? -Nick On Thursday, January 24, 2013 8:59:08 PM UTC-5, Nick C wrote: > > I'm fairly new to this, so apologies in advance. > Let's say I have a 3-d array A[i,j,k] of shape (Ni,Nj,Nk), and a list of > labels I[k], where I[k] belongs to 0...Ni-1. > I want to create a new, 2-d array B[j,k] = A[I[k], j, k]. > I've found one solution to this, which is: > ind_i = np.tile(I, (1,Nj)).T > ind_j = np.tile(np.arange(Nj), (1,Nk)) > ind_k = np.tile(np.arange(Nk), (1, Nj)).T > B = A[ind_i, ind_j, ind_k] > > but something seems...not ideal about this. Can someone tell me if > there's a simpler, better way? > Thanks, > Nick > -------------- next part -------------- An HTML attachment was scrubbed... URL: From sturla at molden.no Fri Jan 25 09:07:52 2013 From: sturla at molden.no (Sturla Molden) Date: Fri, 25 Jan 2013 15:07:52 +0100 Subject: [SciPy-User] scoreatpercentile behaviour In-Reply-To: <5101735C.8040405@hilboll.de> References: <51016865.4050901@hilboll.de> <5101735C.8040405@hilboll.de> Message-ID: <510291B8.7070301@molden.no> On 24.01.2013 18:46, Andreas Hilboll wrote: > Sorry for the noise, it turns out I was just too blind / tired / > whatever to notice the very first line of the function > > values = np.sort(a, axis=0) > > My dumb fault. 
Thanks for making me realize. I wonder if it's not better to use quickselect than quicksort for percentiles? We can quickselect to find the sample closest to the desired percentile. Then (since quickselect leaves the data partially sorted), we can find the k-nearest neighbors by linear search in each direction, possibly using a heap, leaving us with 2*k+1 samples with which to interpolate the score at the percentile. That should give us the percentiles in average O(n) instead of average O(n log n) time. Sturla From josef.pktd at gmail.com Fri Jan 25 10:17:25 2013 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Fri, 25 Jan 2013 10:17:25 -0500 Subject: [SciPy-User] scoreatpercentile behaviour In-Reply-To: <510291B8.7070301@molden.no> References: <51016865.4050901@hilboll.de> <5101735C.8040405@hilboll.de> <510291B8.7070301@molden.no> Message-ID: On Fri, Jan 25, 2013 at 9:07 AM, Sturla Molden wrote: > On 24.01.2013 18:46, Andreas Hilboll wrote: > >> Sorry for the noise, it turns out I was just too blind / tired / >> whatever to notice the very first line of the function >> >> values = np.sort(a, axis=0) >> >> My dumb fault. Thanks for making me realize. > > I wonder if it's not better to use quickselect than quicksort for > percentiles? We can quickselect to find the sample closest to the > desired percentile. Then (since quickselect leaves the data partially > sorted), we can find the k-nearest neighbors by linear search in each > direction, possibly using a heap, leaving us with 2*k+1 samples with > which to interpolate the score at the percentile. That should give us > the percentiles in average O(n) instead of average O(n log n) time. Would this help much if we need the interquartile range, i.e 25 an 75 percentile? (That's the main usecase currently in statsmodels.) quickselect (partial sorting) sounds useful if we just need a few percentiles in the tail close to each other. Josef > > Sturla > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user From sturla at molden.no Fri Jan 25 11:05:55 2013 From: sturla at molden.no (Sturla Molden) Date: Fri, 25 Jan 2013 17:05:55 +0100 Subject: [SciPy-User] scoreatpercentile behaviour In-Reply-To: References: <51016865.4050901@hilboll.de> <5101735C.8040405@hilboll.de> <510291B8.7070301@molden.no> Message-ID: <5102AD63.70806@molden.no> On 25.01.2013 16:17, josef.pktd at gmail.com wrote: > Would this help much if we need the interquartile range, i.e 25 an 75 > percentile? > (That's the main usecase currently in statsmodels.) Yes, it would still scale average O(n) with the size of the data. The interquartile range scales average O(n log n) if we use full sorting. But keep in mind that O(n) can be slower than O(n log n). Sturla From pmhobson at gmail.com Fri Jan 25 14:55:35 2013 From: pmhobson at gmail.com (Paul Hobson) Date: Fri, 25 Jan 2013 11:55:35 -0800 Subject: [SciPy-User] indexing question In-Reply-To: References: Message-ID: On Thu, Jan 24, 2013 at 2:14 PM, Nick Choly wrote: > I'm fairly new to this, so apologies in advance. > Let's say I have a 3-d array A[i,j,k] of shape (Ni,Nj,Nk), and a list of > labels I[k], where I[k] belongs to 0...Ni-1. > I want to create a new, 2-d array B[j,k] = A[I[k], j, k]. 
> I've found one solution to this, which is: > ind_i = np.tile(I, (1,Nj)).T > ind_j = np.tile(np.arange(Nj), (1,Nk)) > ind_k = np.tile(np.arange(Nk), (1, Nj)).T > B = A[ind_i, ind_j, ind_k] > > but something seems...not ideal about this. Can someone tell me if > there's a simpler, better way? > Thanks, > Nick > You can use a colon to grab all the elements along a dimension, e.g., In [48]: A = np.random.random_integers(0, 5, size=(3,4,4)) In [49]: A Out[49]: array([[[5, 3, 5, 5], [2, 4, 0, 5], [4, 3, 4, 5], [1, 4, 0, 1]], [[1, 1, 5, 1], [5, 5, 4, 2], [1, 2, 0, 2], [1, 0, 3, 5]], [[1, 4, 4, 4], [0, 4, 3, 4], [2, 2, 3, 1], [0, 1, 0, 0]]]) In [51]: A[1, :, :] Out[51]: array([[1, 1, 5, 1], [5, 5, 4, 2], [1, 2, 0, 2], [1, 0, 3, 5]]) In [52]: A[1, 0:2, 1:] Out[52]: array([[1, 5, 1], [5, 4, 2]]) Does that help? -paul -------------- next part -------------- An HTML attachment was scrubbed... URL: From fgnu32 at yahoo.com Sat Jan 26 07:31:09 2013 From: fgnu32 at yahoo.com (Fg Nu) Date: Sat, 26 Jan 2013 04:31:09 -0800 (PST) Subject: [SciPy-User] Eigenvalues of masked array Message-ID: <1359203469.65474.YahooMailNeo@web160105.mail.bf1.yahoo.com> How can I compute the eigenvalues and eigenvectors of a masked NumPy array? For an unmasked array, this can be achieved by?scipy.linalg.eig. Thanks From ralf.gommers at gmail.com Sat Jan 26 12:25:22 2013 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Sat, 26 Jan 2013 18:25:22 +0100 Subject: [SciPy-User] Eigenvalues of masked array In-Reply-To: <1359203469.65474.YahooMailNeo@web160105.mail.bf1.yahoo.com> References: <1359203469.65474.YahooMailNeo@web160105.mail.bf1.yahoo.com> Message-ID: On Sat, Jan 26, 2013 at 1:31 PM, Fg Nu wrote: > How can I compute the eigenvalues and eigenvectors of a masked NumPy > array? For an unmasked array, this can be achieved by scipy.linalg.eig. > Not sure what that even means. You need a full array for calculating eigenvalues/vectors. Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Sat Jan 26 12:53:57 2013 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Sat, 26 Jan 2013 12:53:57 -0500 Subject: [SciPy-User] Eigenvalues of masked array In-Reply-To: References: <1359203469.65474.YahooMailNeo@web160105.mail.bf1.yahoo.com> Message-ID: On Sat, Jan 26, 2013 at 12:25 PM, Ralf Gommers wrote: > > > > On Sat, Jan 26, 2013 at 1:31 PM, Fg Nu wrote: >> >> How can I compute the eigenvalues and eigenvectors of a masked NumPy >> array? For an unmasked array, this can be achieved by scipy.linalg.eig. > > > Not sure what that even means. You need a full array for calculating > eigenvalues/vectors. The only way I can see is removing columns or rows with any masked values in it. Josef > > Ralf > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From e.antero.tammi at gmail.com Sat Jan 26 13:58:35 2013 From: e.antero.tammi at gmail.com (eat) Date: Sat, 26 Jan 2013 20:58:35 +0200 Subject: [SciPy-User] Eigenvalues of masked array In-Reply-To: <1359203469.65474.YahooMailNeo@web160105.mail.bf1.yahoo.com> References: <1359203469.65474.YahooMailNeo@web160105.mail.bf1.yahoo.com> Message-ID: Hi, On Sat, Jan 26, 2013 at 2:31 PM, Fg Nu wrote: > How can I compute the eigenvalues and eigenvectors of a masked NumPy > array? For an unmasked array, this can be achieved by scipy.linalg.eig. 
> If the proportion of masked elements is low, you may like to substitute the masked ones with some imputation technique (http://en.wikipedia.org/wiki/Imputation_(statistics)) and then use scipy.linalg.eig. My 2 cents, -eat > > > Thanks > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > -------------- next part -------------- An HTML attachment was scrubbed... URL: From dan.stowell at eecs.qmul.ac.uk Tue Jan 29 06:24:31 2013 From: dan.stowell at eecs.qmul.ac.uk (Dan Stowell) Date: Tue, 29 Jan 2013 11:24:31 +0000 Subject: [SciPy-User] Fwd: IEEE AASP Challenge on Detection and Classification of Acoustic Scenes and Events - deadline 31st March In-Reply-To: <5107AB15.9000205@eecs.qmul.ac.uk> References: <5107AB15.9000205@eecs.qmul.ac.uk> Message-ID: <5107B16F.8010601@eecs.qmul.ac.uk> Dear all, We invite researchers in signal processing, machine learning and other fields to participate in our challenge - the IEEE AASP Challenge on Detection and Classification of Acoustic Scenes and Events. Now available: * Public datasets (CC-licensed, audio and annotations) for scene classification and event detection * Task specifications for our first two tasks, scene classification and event detection (office live) * A Linux virtual machine, which you may use to test your code * Templates for challenge extended abstracts The deadline for submissions is 31st March. Results will be presented/discussed in a special session at WASPAA 2013. Full details: Please feel free to ask us questions directly or via the challenge mailing list. Best wishes, The organisers -- Dimitrios Giannoulis (QMUL), Emmanouil Benetos (CityU/QMUL), Dan Stowell (QMUL), Mathias Rossignol (IRCAM), Mathieu Lagrange (IRCAM) and Mark D. Plumbley (QMUL) From devfreedom at gmail.com Mon Jan 28 22:47:21 2013 From: devfreedom at gmail.com (ICool) Date: Mon, 28 Jan 2013 19:47:21 -0800 Subject: [SciPy-User] scipy 64bit cannot handle very large matrix (integer overflow) Message-ID: Hi All, I am using the arpack package for an eigenvalue problem. The matrix is very sparse but also very big. The dimension is even out of the UINT32 range. I use shift-invert to find the smallest eigenvalues and the associated eigenvectors. The code crashes and returns the following error message (pasted below). I found that '1434729898' is '5729697194' after UINT32 overflow. Looking at the code, it seems that the error comes from the Fortran function 'gmresrevcom', which still uses a 32-bit integer as the input parameter. I am using 64-bit Python. Why is the Fortran function 32-bit? Will it be a general issue for other Fortran functions? Is there any way to fix it?
0-th dimension must be fixed to 1434729898 but got 5729697194 Traceback (most recent call last): File "BigMatrixEigen.py", line 108, in main() File "BigMatrixEigen.py", line 97, in main vals, vecs = eigsh(**sigsh_params) File "E:\code\lib\site-packages\scipy\sparse\linalg\eigen\arpack\arpack.py", line 1557, in eigsh params.iterate() File "E:\code\lib\site-packages\scipy\sparse\linalg\eigen\arpack\arpack.py", line 565, in iterate self.workd[yslice] = self.OPa(self.workd[Bxslice]) File "E:\code\lib\site-packages\scipy\sparse\linalg\interface.py", line 123, in matvec y = self._matvec(x) File "E:\code\lib\site-packages\scipy\sparse\linalg\eigen\arpack\arpack.py", line 960, in _matvec b, info = self.ifunc(self.M, x, tol=self.tol) File "", line 2, in gmres File "E:\code\lib\site-packages\scipy\sparse\linalg\isolve\iterative.py", line 82, in non_reentrant return func(*a, **kw) File "E:\code\lib\site-packages\scipy\sparse\linalg\isolve\iterative.py", line 446, in gmres revcom(b, x, restrt, work, work2, iter_, resid, info, ndx1, ndx2, ijob) _iterative.error: failed in converting 4th argument `work' of _iterative.dgmresrevcom to C/Fortran array -------------- next part -------------- An HTML attachment was scrubbed... URL: From m.manders at cranfield.ac.uk Tue Jan 29 11:33:46 2013 From: m.manders at cranfield.ac.uk (Manders, Mark) Date: Tue, 29 Jan 2013 16:33:46 +0000 Subject: [SciPy-User] Installing Polymode Message-ID: I was trying to install polymode into my Python 2.7 (32bit), for some reason the following errors come up. It can't seem to find the numpy 'lapack' libraries even though I have numpy & scipy installed. I also have boost_python installed but it can't find anything from that either. Z:\polymode>setup.py install W: Using scipy hankel ratio - this may be inaccurate! lapack_opt_info: lapack_mkl_info: mkl_info: libraries mkl,vml,guide not found in C:\Python27\lib libraries mkl,vml,guide not found in C:\ libraries mkl,vml,guide not found in C:\Python27\libs NOT AVAILABLE NOT AVAILABLE atlas_threads_info: Setting PTATLAS=ATLAS libraries ptf77blas,ptcblas,atlas not found in C:\Python27\lib libraries lapack_atlas not found in C:\Python27\lib libraries ptf77blas,ptcblas,atlas not found in C:\ libraries lapack_atlas not found in C:\ libraries ptf77blas,ptcblas,atlas not found in C:\Python27\libs libraries lapack_atlas not found in C:\Python27\libs numpy.distutils.system_info.atlas_threads_info NOT AVAILABLE atlas_info: libraries f77blas,cblas,atlas not found in C:\Python27\lib libraries lapack_atlas not found in C:\Python27\lib libraries f77blas,cblas,atlas not found in C:\ libraries lapack_atlas not found in C:\ libraries f77blas,cblas,atlas not found in C:\Python27\libs libraries lapack_atlas not found in C:\Python27\libs numpy.distutils.system_info.atlas_info NOT AVAILABLE C:\Python27\lib\site-packages\numpy\distutils\system_info.py:1340: UserWarning: Atlas (http://math-atlas.sourceforge.net/) libraries not found. Directories to search for the libraries can be specified in the numpy/distutils/site.cfg file (section [atlas]) or by setting the ATLAS environment variable. warnings.warn(AtlasNotFoundError.__doc__) lapack_info: libraries lapack not found in C:\Python27\lib libraries lapack not found in C:\ libraries lapack not found in C:\Python27\libs NOT AVAILABLE C:\Python27\lib\site-packages\numpy\distutils\system_info.py:1351: UserWarning: Lapack (http://www.netlib.org/lapack/) libraries not found. 
Directories to search for the libraries can be specified in the numpy/distutils/site.cfg file (section [lapack]) or by setting the LAPACK environment variable. warnings.warn(LapackNotFoundError.__doc__) lapack_src_info: NOT AVAILABLE C:\Python27\lib\site-packages\numpy\distutils\system_info.py:1354: UserWarning: Lapack (http://www.netlib.org/lapack/) sources not found. Directories to search for the sources can be specified in the numpy/distutils/site.cfg file (section [lapack_src]) or by setting the LAPACK_SRC environment variable. warnings.warn(LapackSrcNotFoundError.__doc__) NOT AVAILABLE boost_python_info: NOT AVAILABLE Traceback (most recent call last): File "Z:\polymode\setup.py", line 69, in setup_package() File "Z:\polymode\setup.py", line 63, in setup_package cmdclass = {'doc' : generate_api_docs} File "C:\Python27\lib\site-packages\numpy\distutils\core.py", line 152, in set up config = configuration() File "Z:\polymode\setup.py", line 44, in configuration config.add_subpackage(package_name) File "C:\Python27\lib\site-packages\numpy\distutils\misc_util.py", line 1002, in add_subpackage caller_level = 2) File "C:\Python27\lib\site-packages\numpy\distutils\misc_util.py", line 971, i n get_subpackage caller_level = caller_level + 1) File "C:\Python27\lib\site-packages\numpy\distutils\misc_util.py", line 908, i n _get_configuration_from_setup_py config = setup_module.configuration(*args) File "Polymode\setup.py", line 10, in configuration config.add_subpackage('mathlink') File "C:\Python27\lib\site-packages\numpy\distutils\misc_util.py", line 1002, in add_subpackage caller_level = 2) File "C:\Python27\lib\site-packages\numpy\distutils\misc_util.py", line 971, i n get_subpackage caller_level = caller_level + 1) File "C:\Python27\lib\site-packages\numpy\distutils\misc_util.py", line 908, i n _get_configuration_from_setup_py config = setup_module.configuration(*args) File "Polymode\mathlink\setup.py", line 71, in configuration raise NotFoundError,'no lapack/blas resources found' numpy.distutils.system_info.NotFoundError: no lapack/blas resources found -------------- next part -------------- An HTML attachment was scrubbed... URL: From pav at iki.fi Tue Jan 29 12:46:14 2013 From: pav at iki.fi (Pauli Virtanen) Date: Tue, 29 Jan 2013 19:46:14 +0200 Subject: [SciPy-User] scipy 64bit cannot handle very large matrix (integer overflow) In-Reply-To: References: Message-ID: 29.01.2013 05:47, ICool kirjoitti: > I am using arpack package for a eigen problem. The matrix is very > sparse but also very big. The dimension is even out of the UINT32 > range. I use shift-invert to find the minimal eigenvals and associated > eigenvalues. The code crashes and returns the follow error message: > > I found that '1434729898' is a UINT32 number overflowed > from '5729697194'. Looking at the code, it seems that the error is from > the Fortran function 'gmresrevcom', where it is still using 32-bit > integer as the input parameter. I am using 64bit python. Why is the > Fortran function 32bit? Will it be a general issue for other Fortran > functions? Is there anyway to fix it? Fortran integers are usually 32-bit. Fixing this issue would then require changes in the Fortran code. 
-- Pauli Virtanen From pav at iki.fi Tue Jan 29 12:51:46 2013 From: pav at iki.fi (Pauli Virtanen) Date: Tue, 29 Jan 2013 19:51:46 +0200 Subject: [SciPy-User] scipy 64bit cannot handle very large matrix (integer overflow) In-Reply-To: References: Message-ID: 29.01.2013 19:46, Pauli Virtanen kirjoitti: > 29.01.2013 05:47, ICool kirjoitti: >> I am using arpack package for a eigen problem. The matrix is very >> sparse but also very big. The dimension is even out of the UINT32 >> range. I use shift-invert to find the minimal eigenvals and associated >> eigenvalues. The code crashes and returns the follow error message: >> >> I found that '1434729898' is a UINT32 number overflowed >> from '5729697194'. Looking at the code, it seems that the error is from >> the Fortran function 'gmresrevcom', where it is still using 32-bit >> integer as the input parameter. I am using 64bit python. Why is the >> Fortran function 32bit? Will it be a general issue for other Fortran >> functions? Is there anyway to fix it? > > Fortran integers are usually 32-bit. Fixing this issue would then > require changes in the Fortran code. You can file an enhancement request here for this: http://projects.scipy.org/scipy/ However, I'm not promising to work on this in the near future, so unless you want to tackle this yourself, it may take time before someone volunteers to fix it. I'm not fully sure how this is best fixed --- probably needs to use a custom kind for the integer types, and it should be determined on compile time whether the correct width is 4 or 8. -- Pauli Virtanen From osman at fuse.net Tue Jan 29 19:44:41 2013 From: osman at fuse.net (osman buyukisik) Date: Tue, 29 Jan 2013 19:44:41 -0500 Subject: [SciPy-User] scipy 64bit cannot handle very large matrix (integer overflow) In-Reply-To: References: Message-ID: <51086CF9.4050609@fuse.net> On 01/29/2013 12:51 PM, Pauli Virtanen wrote: > 29.01.2013 19:46, Pauli Virtanen kirjoitti: >> 29.01.2013 05:47, ICool kirjoitti: >>> I am using arpack package for a eigen problem. The matrix is very >>> sparse but also very big. The dimension is even out of the UINT32 >>> range. I use shift-invert to find the minimal eigenvals and associated >>> eigenvalues. The code crashes and returns the follow error message: >>> >>> I found that '1434729898' is a UINT32 number overflowed >>> from '5729697194'. Looking at the code, it seems that the error is from >>> the Fortran function 'gmresrevcom', where it is still using 32-bit >>> integer as the input parameter. I am using 64bit python. Why is the >>> Fortran function 32bit? Will it be a general issue for other Fortran >>> functions? Is there anyway to fix it? >> Fortran integers are usually 32-bit. Fixing this issue would then >> require changes in the Fortran code. > You can file an enhancement request here for this: > > http://projects.scipy.org/scipy/ > > However, I'm not promising to work on this in the near future, so unless > you want to tackle this yourself, it may take time before someone > volunteers to fix it. > > I'm not fully sure how this is best fixed --- probably needs to use a > custom kind for the integer types, and it should be determined on > compile time whether the correct width is 4 or 8. > Isn't this usually some kind of compile option like "-i8" ?? 
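[A quick check of the wraparound arithmetic the original report shows; this is illustrative only and not from the thread.]

import numpy as np

# The requested workspace length from the report, and what it becomes
# after wrapping around a 32-bit integer (2**32 = 4294967296).
requested = 5729697194
print(requested - 2**32)                          # 1434729898, the value in the error
print(np.array([requested]).astype(np.int32)[0])  # same wraparound via a C-style cast
                                                  # (exact cast behaviour may vary across NumPy versions)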
From sturla at molden.no Tue Jan 29 23:16:55 2013 From: sturla at molden.no (Sturla Molden) Date: Wed, 30 Jan 2013 05:16:55 +0100 Subject: [SciPy-User] scipy 64bit cannot handle very large matrix (integer overflow) In-Reply-To: <51086CF9.4050609@fuse.net> References: <51086CF9.4050609@fuse.net> Message-ID: <47635D0E-6DB9-41F7-93EE-F03D89457C56@molden.no> Den 30. jan. 2013 kl. 01:44 skrev osman buyukisik : > > Isn't this usually some kind of compile option like "-i8" ?? > Sometimes ? i.e. integers without kind number in the code. Sometimes the kind number is hard-coded. Sometimes the kind number is computed by calling the function "selected_int_kind". On x64, Fortran libraries are usually compled to use 32-bit integers, as it is the native integer size on the platform (64 bit address with 32 bit offset). That is also the reason why a C long on Windows 64 is 32 bits. Sturla From arsenovic at virginia.edu Wed Jan 30 08:29:12 2013 From: arsenovic at virginia.edu (alex arsenovic) Date: Wed, 30 Jan 2013 08:29:12 -0500 Subject: [SciPy-User] ANN: scikit-rf 0.13 Message-ID: Announcement ======================= I would like to announce the release of scikit-rf-0.13! Description ======================= scikit-rf (aka skrf) is an Object Oriented approach to RF/Microwave engineering implemented in the Python programming language. More information and documentation can be found on the website. website : www.scikit-rf.org documentation : http://scikit-rf.org/documentation.html Changes in 0.13 ==================== * read/write support for most skrf objects (through pickle module) * matlab-like annotated smith chart (from gustavocm) * stitch() function to combine network of different frequency bands * more flexible Network constructor (can takes arbitrary properties) * Frequency supports arbitrary frequency vectors * automatic doc-building/uploading via gh-pages.py * virutalInstruments module rename to `vi` * io module added to hold various io capabilities * large improvements in docs, changed to ipython directive -------------- next part -------------- An HTML attachment was scrubbed... URL: From cweisiger at msg.ucsf.edu Wed Jan 30 12:29:01 2013 From: cweisiger at msg.ucsf.edu (Chris Weisiger) Date: Wed, 30 Jan 2013 09:29:01 -0800 Subject: [SciPy-User] Help optimizing an algorithm Message-ID: We have a camera at our lab that has a nonlinear (but monotonic) response to light. I'm attempting to linearize the data output by the camera. I'm doing this by sampling the response curve of the camera, generating a linear fit of the sample, and mapping new data to the linear fit by way of the sample. In other words, we have the following functions: f(x): the response curve of the camera (maps photon intensity to reported counts by the camera) g(x): an approximation of f(x), composed of line segments h(x): a linear fit of g(x) We get a new pixel value Y in -- this is counts reported by the camera. We invert g() to get the approximate photon intensity for that many counts. And then we plug that photon intensity into the linear fit. Right now I believe I have a working algorithm, but it's very slow (which in turn makes testing for validity slow), largely because inverting g() involves iterating over each datapoint in the approximation to find the two that bracket Y so that I can linearly interpolate between them. Having to iterate over every pixel in the image in Python isn't doing me any favors either; we typically deal with 528x512 images so that's 270k iterations per image. 
If anyone has any suggestions for optimizations I could make, I'd love to hear them. My current algorithm can be seen here: http://pastebin.com/mwaxWHGy -Chris -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Wed Jan 30 12:38:26 2013 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Wed, 30 Jan 2013 12:38:26 -0500 Subject: [SciPy-User] Help optimizing an algorithm In-Reply-To: References: Message-ID: On Wed, Jan 30, 2013 at 12:29 PM, Chris Weisiger wrote: > We have a camera at our lab that has a nonlinear (but monotonic) response to > light. I'm attempting to linearize the data output by the camera. I'm doing > this by sampling the response curve of the camera, generating a linear fit > of the sample, and mapping new data to the linear fit by way of the sample. > In other words, we have the following functions: > > f(x): the response curve of the camera (maps photon intensity to reported > counts by the camera) > g(x): an approximation of f(x), composed of line segments > h(x): a linear fit of g(x) > > We get a new pixel value Y in -- this is counts reported by the camera. We > invert g() to get the approximate photon intensity for that many counts. And > then we plug that photon intensity into the linear fit. > > Right now I believe I have a working algorithm, but it's very slow (which in > turn makes testing for validity slow), largely because inverting g() > involves iterating over each datapoint in the approximation to find the two > that bracket Y so that I can linearly interpolate between them. Having to > iterate over every pixel in the image in Python isn't doing me any favors > either; we typically deal with 528x512 images so that's 270k iterations per > image. > > If anyone has any suggestions for optimizations I could make, I'd love to > hear them. My current algorithm can be seen here: > http://pastebin.com/mwaxWHGy np.searchsorted or scipy.interp1d If g is the same for all pixels, then there is no loop necessary and can be done fully vectorized Josef > > -Chris > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From cweisiger at msg.ucsf.edu Wed Jan 30 12:47:37 2013 From: cweisiger at msg.ucsf.edu (Chris Weisiger) Date: Wed, 30 Jan 2013 09:47:37 -0800 Subject: [SciPy-User] Help optimizing an algorithm In-Reply-To: References: Message-ID: Right, I should have clarified that g is different for each pixel. It looks like scipy.interpolate.interp1d ought to do exactly what I want, though I'll have to handle the bounds conditions (where the input data is outside the range of the interpolation function that interp1d generates) myself. Thanks for the help! -Chris On Wed, Jan 30, 2013 at 9:38 AM, wrote: > On Wed, Jan 30, 2013 at 12:29 PM, Chris Weisiger > wrote: > > We have a camera at our lab that has a nonlinear (but monotonic) > response to > > light. I'm attempting to linearize the data output by the camera. I'm > doing > > this by sampling the response curve of the camera, generating a linear > fit > > of the sample, and mapping new data to the linear fit by way of the > sample. 
> > In other words, we have the following functions: > > > > f(x): the response curve of the camera (maps photon intensity to reported > > counts by the camera) > > g(x): an approximation of f(x), composed of line segments > > h(x): a linear fit of g(x) > > > > We get a new pixel value Y in -- this is counts reported by the camera. > We > > invert g() to get the approximate photon intensity for that many counts. > And > > then we plug that photon intensity into the linear fit. > > > > Right now I believe I have a working algorithm, but it's very slow > (which in > > turn makes testing for validity slow), largely because inverting g() > > involves iterating over each datapoint in the approximation to find the > two > > that bracket Y so that I can linearly interpolate between them. Having to > > iterate over every pixel in the image in Python isn't doing me any favors > > either; we typically deal with 528x512 images so that's 270k iterations > per > > image. > > > > If anyone has any suggestions for optimizations I could make, I'd love to > > hear them. My current algorithm can be seen here: > > http://pastebin.com/mwaxWHGy > > > np.searchsorted or scipy.interp1d > > If g is the same for all pixels, then there is no loop necessary and > can be done fully vectorized > > Josef > > > > > -Chris > > > > _______________________________________________ > > SciPy-User mailing list > > SciPy-User at scipy.org > > http://mail.scipy.org/mailman/listinfo/scipy-user > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > -------------- next part -------------- An HTML attachment was scrubbed... URL: From zachary.pincus at yale.edu Wed Jan 30 12:48:16 2013 From: zachary.pincus at yale.edu (Zachary Pincus) Date: Wed, 30 Jan 2013 10:48:16 -0700 Subject: [SciPy-User] Help optimizing an algorithm In-Reply-To: References: Message-ID: > We have a camera at our lab that has a nonlinear (but monotonic) response to light. I'm attempting to linearize the data output by the camera. I'm doing this by sampling the response curve of the camera, generating a linear fit of the sample, and mapping new data to the linear fit by way of the sample. In other words, we have the following functions: > > f(x): the response curve of the camera (maps photon intensity to reported counts by the camera) > g(x): an approximation of f(x), composed of line segments > h(x): a linear fit of g(x) > > We get a new pixel value Y in -- this is counts reported by the camera. We invert g() to get the approximate photon intensity for that many counts. And then we plug that photon intensity into the linear fit. > > Right now I believe I have a working algorithm, but it's very slow (which in turn makes testing for validity slow), largely because inverting g() involves iterating over each datapoint in the approximation to find the two that bracket Y so that I can linearly interpolate between them. Having to iterate over every pixel in the image in Python isn't doing me any favors either; we typically deal with 528x512 images so that's 270k iterations per image. > > If anyone has any suggestions for optimizations I could make, I'd love to hear them. My current algorithm can be seen here: http://pastebin.com/mwaxWHGy Don't have the time to fully spell this out now (in an airport), but the general gist of what you should do is: (1) Make a look-up table mapping input pixel values (as indices) to the desired linearized values. 
So in a simple example with three different possible incoming pixel values 0, 1, and 2, which should be mapped to values 0, 10, and 200, you would have an array like so: table = numpy.array([0, 10, 200]). To make the input/output relationship between the pixel values and index positions clear, you then have: table[0] = 0 table[1] = 10 table[2] = 200 which is exactly the functional relationship you want. (2) Use fancy-indexing to replace input pixel values with look-up table values. (Read up on the numpy fancy-indexing tutorials that google can point you to.) import numpy table = numpy.array([0, 10, 200]) input = numpy.array([[0, 1, 2, 0], [0, 0, 0, 1], [1, 1, 1, 1]]) output = table[input] print output gives: array([[ 0, 10, 200, 0], [ 0, 0, 0, 10], [ 10, 10, 10, 10]]) Basically, what this is saying is to treat "input" as an array of indices into the table array, and generate an output array that contains the value of the table at each index. It's plenty fast, too (using an example 12-bit camera that produces 2048x2048-pixel images): table = numpy.empty(dtype=numpy.float, shape=(4096,)) input = numpy.empty(dtype=numpy.uint16, shape=(2048,2048)) timeit table[input] # note timeit is from ipython, which you should be using if you aren't. 10 loops, best of 3: 43.3 ms per loop Note that numpy.take does basically the same thing, but there are less special cases, so it's faster: In [14]: timeit numpy.take(table, input) 10 loops, best of 3: 27.4 ms per loop Obviously this only works if your camera produces integer pixel values that can be used as indices into an array. But I don't know of any cameras that don't do this. Zach From zachary.pincus at yale.edu Wed Jan 30 12:57:59 2013 From: zachary.pincus at yale.edu (Zachary Pincus) Date: Wed, 30 Jan 2013 10:57:59 -0700 Subject: [SciPy-User] Help optimizing an algorithm In-Reply-To: References: Message-ID: <05182E1C-E5D6-41DF-93FC-FB1BB3028CE6@yale.edu> If g is different for each pixel, the look-up table approach will probably still work. You'll need a 3D look-up table mapping the function for each pixel: 2x2 case: table = numpy.array([[[0,1,200], [0,1,2]], [[1,2,3], [0,100,200]]]) input = numpy.array([[0,2],[1,1]]) I can't figure out exactly how to do this right now and my flight's boarding, but perhaps some fancy-indexing wizards can help. Also, scipy.ndimage.interpolate could be used in various contexts to do look-up/function interpolation in this context, all in parallel. Zach On Jan 30, 2013, at 10:47 AM, Chris Weisiger wrote: > Right, I should have clarified that g is different for each pixel. It looks like scipy.interpolate.interp1d ought to do exactly what I want, though I'll have to handle the bounds conditions (where the input data is outside the range of the interpolation function that interp1d generates) myself. Thanks for the help! > > -Chris > > > On Wed, Jan 30, 2013 at 9:38 AM, wrote: > On Wed, Jan 30, 2013 at 12:29 PM, Chris Weisiger wrote: > > We have a camera at our lab that has a nonlinear (but monotonic) response to > > light. I'm attempting to linearize the data output by the camera. I'm doing > > this by sampling the response curve of the camera, generating a linear fit > > of the sample, and mapping new data to the linear fit by way of the sample. 
> > In other words, we have the following functions: > > > > f(x): the response curve of the camera (maps photon intensity to reported > > counts by the camera) > > g(x): an approximation of f(x), composed of line segments > > h(x): a linear fit of g(x) > > > > We get a new pixel value Y in -- this is counts reported by the camera. We > > invert g() to get the approximate photon intensity for that many counts. And > > then we plug that photon intensity into the linear fit. > > > > Right now I believe I have a working algorithm, but it's very slow (which in > > turn makes testing for validity slow), largely because inverting g() > > involves iterating over each datapoint in the approximation to find the two > > that bracket Y so that I can linearly interpolate between them. Having to > > iterate over every pixel in the image in Python isn't doing me any favors > > either; we typically deal with 528x512 images so that's 270k iterations per > > image. > > > > If anyone has any suggestions for optimizations I could make, I'd love to > > hear them. My current algorithm can be seen here: > > http://pastebin.com/mwaxWHGy > > > np.searchsorted or scipy.interp1d > > If g is the same for all pixels, then there is no loop necessary and > can be done fully vectorized > > Josef > > > > > -Chris > > > > _______________________________________________ > > SciPy-User mailing list > > SciPy-User at scipy.org > > http://mail.scipy.org/mailman/listinfo/scipy-user > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user From cweisiger at msg.ucsf.edu Wed Jan 30 13:09:00 2013 From: cweisiger at msg.ucsf.edu (Chris Weisiger) Date: Wed, 30 Jan 2013 10:09:00 -0800 Subject: [SciPy-User] Help optimizing an algorithm In-Reply-To: <05182E1C-E5D6-41DF-93FC-FB1BB3028CE6@yale.edu> References: <05182E1C-E5D6-41DF-93FC-FB1BB3028CE6@yale.edu> Message-ID: Thanks, but considering that in practice we can have values up to 2^16 from this camera, this seems likely to create an approximately 2^16 x 2^9 x 2^9 array, because the camera's outputs are unsigned 16-bit integers. With two bytes per element, that'd mean 32.768 GB of RAM for the lookup table. Of course it'd be much more feasible if the input range can be constrained to not be the full range of data. Another thing I should note is that the nonlinearity in the camera response is fairly localized -- there's one region about 20 counts wide, and another about 500 counts wide, and the rest of the response is basically linear. So outside of those two regions, we can just linearly interpolate and be confident we're getting the right value. -Chris On Wed, Jan 30, 2013 at 9:57 AM, Zachary Pincus wrote: > If g is different for each pixel, the look-up table approach will probably > still work. You'll need a 3D look-up table mapping the function for each > pixel: > > 2x2 case: > table = numpy.array([[[0,1,200], [0,1,2]], [[1,2,3], [0,100,200]]]) > input = numpy.array([[0,2],[1,1]]) > > I can't figure out exactly how to do this right now and my flight's > boarding, but perhaps some fancy-indexing wizards can help. Also, > scipy.ndimage.interpolate could be used in various contexts to do > look-up/function interpolation in this context, all in parallel. 
> > Zach > > > > On Jan 30, 2013, at 10:47 AM, Chris Weisiger wrote: > > > Right, I should have clarified that g is different for each pixel. It > looks like scipy.interpolate.interp1d ought to do exactly what I want, > though I'll have to handle the bounds conditions (where the input data is > outside the range of the interpolation function that interp1d generates) > myself. Thanks for the help! > > > > -Chris > > > > > > On Wed, Jan 30, 2013 at 9:38 AM, wrote: > > On Wed, Jan 30, 2013 at 12:29 PM, Chris Weisiger > wrote: > > > We have a camera at our lab that has a nonlinear (but monotonic) > response to > > > light. I'm attempting to linearize the data output by the camera. I'm > doing > > > this by sampling the response curve of the camera, generating a linear > fit > > > of the sample, and mapping new data to the linear fit by way of the > sample. > > > In other words, we have the following functions: > > > > > > f(x): the response curve of the camera (maps photon intensity to > reported > > > counts by the camera) > > > g(x): an approximation of f(x), composed of line segments > > > h(x): a linear fit of g(x) > > > > > > We get a new pixel value Y in -- this is counts reported by the > camera. We > > > invert g() to get the approximate photon intensity for that many > counts. And > > > then we plug that photon intensity into the linear fit. > > > > > > Right now I believe I have a working algorithm, but it's very slow > (which in > > > turn makes testing for validity slow), largely because inverting g() > > > involves iterating over each datapoint in the approximation to find > the two > > > that bracket Y so that I can linearly interpolate between them. Having > to > > > iterate over every pixel in the image in Python isn't doing me any > favors > > > either; we typically deal with 528x512 images so that's 270k > iterations per > > > image. > > > > > > If anyone has any suggestions for optimizations I could make, I'd love > to > > > hear them. My current algorithm can be seen here: > > > http://pastebin.com/mwaxWHGy > > > > > > np.searchsorted or scipy.interp1d > > > > If g is the same for all pixels, then there is no loop necessary and > > can be done fully vectorized > > > > Josef > > > > > > > > -Chris > > > > > > _______________________________________________ > > > SciPy-User mailing list > > > SciPy-User at scipy.org > > > http://mail.scipy.org/mailman/listinfo/scipy-user > > > > > _______________________________________________ > > SciPy-User mailing list > > SciPy-User at scipy.org > > http://mail.scipy.org/mailman/listinfo/scipy-user > > > > _______________________________________________ > > SciPy-User mailing list > > SciPy-User at scipy.org > > http://mail.scipy.org/mailman/listinfo/scipy-user > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From josef.pktd at gmail.com Wed Jan 30 13:27:37 2013 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Wed, 30 Jan 2013 13:27:37 -0500 Subject: [SciPy-User] Help optimizing an algorithm In-Reply-To: References: <05182E1C-E5D6-41DF-93FC-FB1BB3028CE6@yale.edu> Message-ID: On Wed, Jan 30, 2013 at 1:09 PM, Chris Weisiger wrote: > Thanks, but considering that in practice we can have values up to 2^16 from > this camera, this seems likely to create an approximately 2^16 x 2^9 x 2^9 > array, because the camera's outputs are unsigned 16-bit integers. With two > bytes per element, that'd mean 32.768 GB of RAM for the lookup table. Of > course it'd be much more feasible if the input range can be constrained to > not be the full range of data. > > Another thing I should note is that the nonlinearity in the camera response > is fairly localized -- there's one region about 20 counts wide, and another > about 500 counts wide, and the rest of the response is basically linear. So > outside of those two regions, we can just linearly interpolate and be > confident we're getting the right value. somwhere in between If the break points of the line segments are at the same location (for the relevant pixels), then the call to np.searchsorted could still be vectorized, and then used as index into the relevant parts of g. However, sometimes using large intermediaries can easily be beaten by a python loop in speed. Josef > > -Chris > > > On Wed, Jan 30, 2013 at 9:57 AM, Zachary Pincus > wrote: >> >> If g is different for each pixel, the look-up table approach will probably >> still work. You'll need a 3D look-up table mapping the function for each >> pixel: >> >> 2x2 case: >> table = numpy.array([[[0,1,200], [0,1,2]], [[1,2,3], [0,100,200]]]) >> input = numpy.array([[0,2],[1,1]]) >> >> I can't figure out exactly how to do this right now and my flight's >> boarding, but perhaps some fancy-indexing wizards can help. Also, >> scipy.ndimage.interpolate could be used in various contexts to do >> look-up/function interpolation in this context, all in parallel. >> >> Zach >> >> >> >> On Jan 30, 2013, at 10:47 AM, Chris Weisiger wrote: >> >> > Right, I should have clarified that g is different for each pixel. It >> > looks like scipy.interpolate.interp1d ought to do exactly what I want, >> > though I'll have to handle the bounds conditions (where the input data is >> > outside the range of the interpolation function that interp1d generates) >> > myself. Thanks for the help! >> > >> > -Chris >> > >> > >> > On Wed, Jan 30, 2013 at 9:38 AM, wrote: >> > On Wed, Jan 30, 2013 at 12:29 PM, Chris Weisiger >> > wrote: >> > > We have a camera at our lab that has a nonlinear (but monotonic) >> > > response to >> > > light. I'm attempting to linearize the data output by the camera. I'm >> > > doing >> > > this by sampling the response curve of the camera, generating a linear >> > > fit >> > > of the sample, and mapping new data to the linear fit by way of the >> > > sample. >> > > In other words, we have the following functions: >> > > >> > > f(x): the response curve of the camera (maps photon intensity to >> > > reported >> > > counts by the camera) >> > > g(x): an approximation of f(x), composed of line segments >> > > h(x): a linear fit of g(x) >> > > >> > > We get a new pixel value Y in -- this is counts reported by the >> > > camera. We >> > > invert g() to get the approximate photon intensity for that many >> > > counts. And >> > > then we plug that photon intensity into the linear fit. 
>> > > >> > > Right now I believe I have a working algorithm, but it's very slow >> > > (which in >> > > turn makes testing for validity slow), largely because inverting g() >> > > involves iterating over each datapoint in the approximation to find >> > > the two >> > > that bracket Y so that I can linearly interpolate between them. Having >> > > to >> > > iterate over every pixel in the image in Python isn't doing me any >> > > favors >> > > either; we typically deal with 528x512 images so that's 270k >> > > iterations per >> > > image. >> > > >> > > If anyone has any suggestions for optimizations I could make, I'd love >> > > to >> > > hear them. My current algorithm can be seen here: >> > > http://pastebin.com/mwaxWHGy >> > >> > >> > np.searchsorted or scipy.interp1d >> > >> > If g is the same for all pixels, then there is no loop necessary and >> > can be done fully vectorized >> > >> > Josef >> > >> > > >> > > -Chris >> > > >> > > _______________________________________________ >> > > SciPy-User mailing list >> > > SciPy-User at scipy.org >> > > http://mail.scipy.org/mailman/listinfo/scipy-user >> > > >> > _______________________________________________ >> > SciPy-User mailing list >> > SciPy-User at scipy.org >> > http://mail.scipy.org/mailman/listinfo/scipy-user >> > >> > _______________________________________________ >> > SciPy-User mailing list >> > SciPy-User at scipy.org >> > http://mail.scipy.org/mailman/listinfo/scipy-user >> >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user > > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From ralf.gommers at gmail.com Wed Jan 30 14:46:33 2013 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Wed, 30 Jan 2013 20:46:33 +0100 Subject: [SciPy-User] Installing Polymode In-Reply-To: References: Message-ID: On Tue, Jan 29, 2013 at 5:33 PM, Manders, Mark wrote: > I was trying to install polymode into my Python 2.7 (32bit), for some > reason the following errors come up. It can?t seem to find the numpy > ?lapack? libraries even though I have numpy & scipy installed. I also have > boost_python installed but it can?t find anything from that either.**** > Did you install numpy/scipy from source or with a binary install? If the latter, you probably don't have a separate BLAS/LAPACK on your system. Either way, you need a setup.cfg to point setup.py to whereever your BLAS/LAPACK is on your system. Note that Polymode is provided as Python(x,y) plugin as well. So if you use Python(x,y) that would be the way to go. Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From jkhilmer at chemistry.montana.edu Wed Jan 30 15:37:30 2013 From: jkhilmer at chemistry.montana.edu (jkhilmer at chemistry.montana.edu) Date: Wed, 30 Jan 2013 13:37:30 -0700 Subject: [SciPy-User] Help optimizing an algorithm In-Reply-To: References: <05182E1C-E5D6-41DF-93FC-FB1BB3028CE6@yale.edu> Message-ID: Chris, Can't you extend Zach's suggestion to pixel categories? Rather than a 2^9 x 2^9 matrix, have a vector of len=3 to describe the different possible types of pixel response curves. 
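Something along these lines might work. This is only a rough sketch; the class count, shapes, and placeholder data below are invented for illustration and are not taken from the real calibration:

import numpy as np

# Hypothetical calibration products, built offline from the flat-field series:
# class_map assigns each pixel to one of a few response-curve categories, and
# tables[c, v] holds the linearized value for raw count v under category c.
n_classes = 3
class_map = np.random.randint(0, n_classes, size=(528, 512))          # stand-in labels
tables = np.tile(np.arange(2**16, dtype=np.float32), (n_classes, 1))  # stand-in curves,
                                                                      # ~0.25 MB per class

def linearize(raw):
    # Fancy indexing looks up, for every pixel at once, its category's table
    # entry at its raw count, with no Python-level loop over pixels.
    return tables[class_map, raw]

raw_image = np.random.randint(0, 2**16, size=(528, 512)).astype(np.uint16)
corrected = linearize(raw_image)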
Jonathan On Wed, Jan 30, 2013 at 11:09 AM, Chris Weisiger wrote: > Thanks, but considering that in practice we can have values up to 2^16 > from this camera, this seems likely to create an approximately 2^16 x 2^9 x > 2^9 array, because the camera's outputs are unsigned 16-bit integers. With > two bytes per element, that'd mean 32.768 GB of RAM for the lookup table. > Of course it'd be much more feasible if the input range can be constrained > to not be the full range of data. > > Another thing I should note is that the nonlinearity in the camera > response is fairly localized -- there's one region about 20 counts wide, > and another about 500 counts wide, and the rest of the response is > basically linear. So outside of those two regions, we can just linearly > interpolate and be confident we're getting the right value. > > -Chris > > > On Wed, Jan 30, 2013 at 9:57 AM, Zachary Pincus wrote: > >> If g is different for each pixel, the look-up table approach will >> probably still work. You'll need a 3D look-up table mapping the function >> for each pixel: >> >> 2x2 case: >> table = numpy.array([[[0,1,200], [0,1,2]], [[1,2,3], [0,100,200]]]) >> input = numpy.array([[0,2],[1,1]]) >> >> I can't figure out exactly how to do this right now and my flight's >> boarding, but perhaps some fancy-indexing wizards can help. Also, >> scipy.ndimage.interpolate could be used in various contexts to do >> look-up/function interpolation in this context, all in parallel. >> >> Zach >> >> >> >> On Jan 30, 2013, at 10:47 AM, Chris Weisiger wrote: >> >> > Right, I should have clarified that g is different for each pixel. It >> looks like scipy.interpolate.interp1d ought to do exactly what I want, >> though I'll have to handle the bounds conditions (where the input data is >> outside the range of the interpolation function that interp1d generates) >> myself. Thanks for the help! >> > >> > -Chris >> > >> > >> > On Wed, Jan 30, 2013 at 9:38 AM, wrote: >> > On Wed, Jan 30, 2013 at 12:29 PM, Chris Weisiger < >> cweisiger at msg.ucsf.edu> wrote: >> > > We have a camera at our lab that has a nonlinear (but monotonic) >> response to >> > > light. I'm attempting to linearize the data output by the camera. I'm >> doing >> > > this by sampling the response curve of the camera, generating a >> linear fit >> > > of the sample, and mapping new data to the linear fit by way of the >> sample. >> > > In other words, we have the following functions: >> > > >> > > f(x): the response curve of the camera (maps photon intensity to >> reported >> > > counts by the camera) >> > > g(x): an approximation of f(x), composed of line segments >> > > h(x): a linear fit of g(x) >> > > >> > > We get a new pixel value Y in -- this is counts reported by the >> camera. We >> > > invert g() to get the approximate photon intensity for that many >> counts. And >> > > then we plug that photon intensity into the linear fit. >> > > >> > > Right now I believe I have a working algorithm, but it's very slow >> (which in >> > > turn makes testing for validity slow), largely because inverting g() >> > > involves iterating over each datapoint in the approximation to find >> the two >> > > that bracket Y so that I can linearly interpolate between them. >> Having to >> > > iterate over every pixel in the image in Python isn't doing me any >> favors >> > > either; we typically deal with 528x512 images so that's 270k >> iterations per >> > > image. >> > > >> > > If anyone has any suggestions for optimizations I could make, I'd >> love to >> > > hear them. 
My current algorithm can be seen here: >> > > http://pastebin.com/mwaxWHGy >> > >> > >> > np.searchsorted or scipy.interp1d >> > >> > If g is the same for all pixels, then there is no loop necessary and >> > can be done fully vectorized >> > >> > Josef >> > >> > > >> > > -Chris >> > > >> > > _______________________________________________ >> > > SciPy-User mailing list >> > > SciPy-User at scipy.org >> > > http://mail.scipy.org/mailman/listinfo/scipy-user >> > > >> > _______________________________________________ >> > SciPy-User mailing list >> > SciPy-User at scipy.org >> > http://mail.scipy.org/mailman/listinfo/scipy-user >> > >> > _______________________________________________ >> > SciPy-User mailing list >> > SciPy-User at scipy.org >> > http://mail.scipy.org/mailman/listinfo/scipy-user >> >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user >> > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > -------------- next part -------------- An HTML attachment was scrubbed... URL: From cgohlke at uci.edu Wed Jan 30 17:33:14 2013 From: cgohlke at uci.edu (Christoph Gohlke) Date: Wed, 30 Jan 2013 14:33:14 -0800 Subject: [SciPy-User] Installing Polymode In-Reply-To: References: Message-ID: <51099FAA.3010604@uci.edu> On 1/29/2013 8:33 AM, Manders, Mark wrote: > I was trying to install polymode into my Python 2.7 (32bit), for some > reason the following errors come up. It can?t seem to find the numpy > ?lapack? libraries even though I have numpy & scipy installed. I also > have boost_python installed but it can?t find anything from that either. > Try . It requires Numpy-MKL and Scipy from the same website. Christoph From zachary.pincus at yale.edu Thu Jan 31 00:10:50 2013 From: zachary.pincus at yale.edu (Zachary Pincus) Date: Thu, 31 Jan 2013 00:10:50 -0500 Subject: [SciPy-User] Help optimizing an algorithm In-Reply-To: References: <05182E1C-E5D6-41DF-93FC-FB1BB3028CE6@yale.edu> Message-ID: <060FF2EE-A60D-402A-BFEE-E3EEDFBBF468@yale.edu> > Thanks, but considering that in practice we can have values up to 2^16 from this camera, this seems likely to create an approximately 2^16 x 2^9 x 2^9 array, because the camera's outputs are unsigned 16-bit integers. With two bytes per element, that'd mean 32.768 GB of RAM for the lookup table. Of course it'd be much more feasible if the input range can be constrained to not be the full range of data. > > Another thing I should note is that the nonlinearity in the camera response is fairly localized -- there's one region about 20 counts wide, and another about 500 counts wide, and the rest of the response is basically linear. So outside of those two regions, we can just linearly interpolate and be confident we're getting the right value. Ah, yeah, if you're getting legitimately 16 bits of dynamic range then a per-pixel lookup table scales rather badly! I think that your best bet would be to look at scipy.ndimage.map_coordinates(). It's like the fancy-indexing trick I proposed in that you provide an array of indices that map into an look-up table, except now the indices can be float values and map_coordinates() interpolates the value of the look-up table between positions (interpolation is with spline fits of a specified order, including 1 aka linear). 
So you'd need a 2^9 x 2^9 x n array for the table, where n is the number of "control points" for the (linear or spline) fit to your transfer functions. Then for each input image, you'd pass in a 3 x 2^9 x 2^9 array of coordinates to map into the input. In the simplest case, the coordinate array would look like this: [x_indices, y_indices, n * input.astype(float) / 2**16)], where (x_indices, y_indices) = numpy.indices((2**9,2**9)). This would be for n control points evenly spaced across the 2^16 range, obviously. You could of course also set the positions of the control points for each transfer function differently for each pixel, which would necessitate a slightly more complex transformation of the original image's values into interpolation positions in the input array, but that's straightforward enough. If this doesn't make clear sense, I'll send proper example code. And if that doesn't run fast enough, you can use OpenGL to do the same thing but with the linear interpolation running on the GPU, using GLSL and a 2D texture sampler (the input image) to build up the coordinates to send to a 3D texture sampler (the lookup table). This is actually a lot less scary than it sounds, and I can give some tips if needed. Zach PS. What sort of camera has a true 16-bit depth and needs a per-pixel linearity correction? Is it an EMCCD? From zachary.pincus at yale.edu Thu Jan 31 00:17:11 2013 From: zachary.pincus at yale.edu (Zachary Pincus) Date: Thu, 31 Jan 2013 00:17:11 -0500 Subject: [SciPy-User] Help optimizing an algorithm In-Reply-To: References: <05182E1C-E5D6-41DF-93FC-FB1BB3028CE6@yale.edu> Message-ID: > Another thing I should note is that the nonlinearity in the camera response is fairly localized -- there's one region about 20 counts wide, and another about 500 counts wide, and the rest of the response is basically linear. So outside of those two regions, we can just linearly interpolate and be confident we're getting the right value. Aah, just noticed this. In that case, then the per-pixel lookup table might still work. Do one simple lookup with a single transfer function as I initially described, and then patch that output just in the error-prone regions with per-pixel lookup results, using much smaller x,y-sized lookup tables (either via fancy indexing, or scipy.ndimage.map_coordinates). Of course, if map_coordinates is fast enough for the whole image (and it should be), then there's no advantage to doing the above as it's a bit less general and more hacky. Zach From opossumnano at gmail.com Thu Jan 31 05:56:28 2013 From: opossumnano at gmail.com (Tiziano Zito) Date: Thu, 31 Jan 2013 11:56:28 +0100 (CET) Subject: [SciPy-User] =?utf-8?q?=5BANN=5D_Summer_School_=22Advanced_Scient?= =?utf-8?q?ific_Programming_in_Python=22_in_Z=C3=BCrich=2C_Switzerland?= Message-ID: <20130131105628.3235612E00D6@comms.bccn-berlin.de> Advanced Scientific Programming in Python ========================================= a Summer School by the G-Node and the Physik-Institut, University of Zurich Scientists spend more and more time writing, maintaining, and debugging software. While techniques for doing this efficiently have evolved, only few scientists actually use them. As a result, instead of doing their research, they spend far too much time writing deficient code and reinventing the wheel. In this course we will present a selection of advanced programming techniques, incorporating theoretical lectures and practical exercises tailored to the needs of a programming scientist. 
New skills will be tested in a real programming project: we will team up
to develop an entertaining scientific computer game.

We use the Python programming language for the entire course. Python works
as a simple programming language for beginners, but more importantly, it
also works great in scientific simulations and data analysis. We show how
clean language design, ease of extensibility, and the great wealth of open
source libraries for scientific computing and data visualization are
driving Python to become a standard tool for the programming scientist.

This school is targeted at Master or PhD students and Post-docs from all
areas of science. Competence in Python or in another language such as
Java, C/C++, MATLAB, or Mathematica is absolutely required. Basic
knowledge of Python is assumed. Participants without any prior experience
with Python should work through the proposed introductory materials before
the course.

Date and Location
=================

September 1–6, 2013. Zürich, Switzerland.

Preliminary Program
===================

Day 0 (Sun Sept 1) – Best Programming Practices
- Best Practices, Development Methodologies and the Zen of Python
- Version control with git
- Object-oriented programming & design patterns

Day 1 (Mon Sept 2) – Software Carpentry
- Test-driven development, unit testing & quality assurance
- Debugging, profiling and benchmarking techniques
- Best practices in data visualization
- Programming in teams

Day 2 (Tue Sept 3) – Scientific Tools for Python
- Advanced NumPy
- The Quest for Speed (intro): Interfacing to C with Cython
- Advanced Python I: idioms, useful built-in data structures, generators

Day 3 (Wed Sept 4) – The Quest for Speed
- Writing parallel applications in Python
- Programming project

Day 4 (Thu Sept 5) – Efficient Memory Management
- When parallelization does not help: the starving CPUs problem
- Advanced Python II: decorators and context managers
- Programming project

Day 5 (Fri Sept 6) – Practical Software Development
- Programming project
- The Pelita Tournament

Every evening we will have the tutors' consultation hour: Tutors will
answer your questions and give suggestions for your own projects.

Applications
============

You can apply on-line at http://python.g-node.org

Applications must be submitted before 23:59 CEST, May 1, 2013.
Notifications of acceptance will be sent by June 1, 2013.

No fee is charged but participants should take care of travel, living,
and accommodation expenses. Candidates will be selected on the basis of
their profile. Places are limited: acceptance rate is usually around 20%.

Prerequisites: You are supposed to know the basics of Python to
participate in the lectures. You are encouraged to go through the
introductory material available on the website.

Faculty
=======

- Francesc Alted, Continuum Analytics Inc., USA
- Pietro Berkes, Enthought Inc., UK
- Valentin Haenel, freelance developer and consultant, Berlin, Germany
- Zbigniew Jędrzejewski-Szmek, Krasnow Institute, George Mason University, USA
- Eilif Muller, Blue Brain Project, École Polytechnique Fédérale de Lausanne, Switzerland
- Emanuele Olivetti, NeuroInformatics Laboratory, Fondazione Bruno Kessler and University of Trento, Italy
- Rike-Benjamin Schuppner, Technologit GbR, Germany
- Bartosz Teleńczuk, Unité
  de Neurosciences Information et Complexité, CNRS, France
- Stéfan van der Walt, Applied Mathematics, Stellenbosch University, South Africa
- Bastian Venthur, Berlin Institute of Technology and Bernstein Focus Neurotechnology, Germany
- Niko Wilbert, TNG Technology Consulting GmbH, Germany
- Tiziano Zito, Institute for Theoretical Biology, Humboldt-Universität zu Berlin, Germany

Organized by Nicola Chiapolini and colleagues of the Physik-Institut, University of Zurich, and by Zbigniew Jędrzejewski-Szmek and Tiziano Zito for the German Neuroinformatics Node of the INCF.

Website: http://python.g-node.org
Contact: python-info at g-node.org

From m.manders at cranfield.ac.uk Thu Jan 31 07:45:08 2013
From: m.manders at cranfield.ac.uk (Manders, Mark)
Date: Thu, 31 Jan 2013 12:45:08 +0000
Subject: [SciPy-User] Installing Polymode
In-Reply-To: <51099FAA.3010604@uci.edu>
References: <51099FAA.3010604@uci.edu>
Message-ID: 

Thanks, that worked

-----Original Message-----
From: scipy-user-bounces at scipy.org [mailto:scipy-user-bounces at scipy.org] On Behalf Of Christoph Gohlke
Sent: 30 January 2013 22:33
To: scipy-user at scipy.org
Subject: Re: [SciPy-User] Installing Polymode

On 1/29/2013 8:33 AM, Manders, Mark wrote:
> I was trying to install polymode into my Python 2.7 (32bit), for some
> reason the following errors come up. It can't seem to find the numpy
> 'lapack' libraries even though I have numpy & scipy installed. I also
> have boost_python installed but it can't find anything from that either.
>

Try . It requires Numpy-MKL and Scipy from the same website.

Christoph

_______________________________________________
SciPy-User mailing list
SciPy-User at scipy.org
http://mail.scipy.org/mailman/listinfo/scipy-user

From cweisiger at msg.ucsf.edu Thu Jan 31 11:14:40 2013
From: cweisiger at msg.ucsf.edu (Chris Weisiger)
Date: Thu, 31 Jan 2013 08:14:40 -0800
Subject: [SciPy-User] Help optimizing an algorithm
In-Reply-To: 
References: <05182E1C-E5D6-41DF-93FC-FB1BB3028CE6@yale.edu>
Message-ID: 

I think that map_coordinates sounds promising; I'll definitely give it a look-see. Thanks for the advice! Going to OpenGL probably won't be required.

I gave scipy.interpolate.interp1d a shot, incidentally, and it gave me about a 50% speedup (from ~20s to ~11s to correct one 528x512 image). We'd still be talking hours to correct an entire dataset, though.

The camera we're talking about here is a CMOS camera. I think our particular one is unusually bad (it was one of the first of its model), but from what I've heard most, if not all, high-dynamic-range CMOS cameras suffer from some degree of nonlinearity. The problem is that they use two different amplifiers, one for low photon counts and the other for high counts. The region where they "hand off" between each other is where the large nonlinearity is visible. The other smaller nonlinear region I mentioned is a very minor effect right at the bottom of the camera's sensitivity. Honestly I don't know that it's worth worrying about much (uncorrected, you'd be off by under half a count in the vast majority of pixels), but I'm writing my code to be generic so it can handle any kind of monotonic nonlinearity.

Here's the results of two of the most nonlinear pixels from my first run, incidentally:
http://derakon.dyndns.org/~chriswei/temp2/firstAttemptLinearization.png

Clearly I have a bug somewhere causing the first few datapoints to be way off, but I think I know what's going on there. The overall approach is sound, anyway.
-Chris On Wed, Jan 30, 2013 at 9:17 PM, Zachary Pincus wrote: > > Another thing I should note is that the nonlinearity in the camera > response is fairly localized -- there's one region about 20 counts wide, > and another about 500 counts wide, and the rest of the response is > basically linear. So outside of those two regions, we can just linearly > interpolate and be confident we're getting the right value. > > Aah, just noticed this. In that case, then the per-pixel lookup table > might still work. Do one simple lookup with a single transfer function as I > initially described, and then patch that output just in the error-prone > regions with per-pixel lookup results, using much smaller x,y-sized lookup > tables (either via fancy indexing, or scipy.ndimage.map_coordinates). > > Of course, if map_coordinates is fast enough for the whole image (and it > should be), then there's no advantage to doing the above as it's a bit less > general and more hacky. > > Zach > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > -------------- next part -------------- An HTML attachment was scrubbed... URL: From zachary.pincus at yale.edu Thu Jan 31 12:39:25 2013 From: zachary.pincus at yale.edu (Zachary Pincus) Date: Thu, 31 Jan 2013 12:39:25 -0500 Subject: [SciPy-User] Help optimizing an algorithm In-Reply-To: References: <05182E1C-E5D6-41DF-93FC-FB1BB3028CE6@yale.edu> Message-ID: <640567B9-BDEE-43A9-B80F-9B3D55F347FF@yale.edu> > I think that map_coordinates sounds promising; I'll definitely give it a look-see. Thanks for the advice! Going to OpenGL probably won't be required. Good luck with that, and let me know if you have any questions on this front. > The camera we're talking about here is a CMOS camera. I think our particular one is unusually bad (it was one of the first of its model), but from what I've heard most, if not all, high-dynamic-range CMOS cameras suffer from some degree of nonlinearity. The problem is that they use two different amplifiers, one for low photon counts and the other for high counts. The region where they "hand off" between each other is where the large nonlinearity is visible. The other smaller nonlinear region I mentioned is a very minor effect right at the bottom of the camera's sensitivity. Honestly I don't know that it's worth worrying about much (uncorrected, you'd be off by under half a count in the vast majority of pixels), but I'm writing my code to be generic so it can handle any kind of monotonic nonlinearity. I presume you've seen this article about some of the sCMOS cameras, but if not: http://www.microscopy-analysis.com/files/jwiley_microscopy/2012_January_Sabharwal.pdf They mention the dual amplifier gain issues, and also point out some potential trouble spots (toward the end in the "unexpected findings" section) with the low-gain amplifier at least for the (unidentified) camera they used. Worth knowing about... > Here's the results of two of the most nonlinear pixels from my first run, incidentally: > http://derakon.dyndns.org/~chriswei/temp2/firstAttemptLinearization.png > > Clearly I have a bug somewhere causing the first few datapoints to be way off, but I think I know what's going on there. The overall approach is sound, anyway. Yeah, that looks decent. Good luck getting the processing working at useful speeds! 
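In case it's useful, here is roughly what that "one shared table, then patch only the bad window" combination could look like. The window bounds, shapes, and placeholder tables are invented for illustration, not taken from the real calibration:

import numpy as np

bad_lo, bad_hi = 1200, 1220                        # made-up 20-count handoff window
global_table = np.arange(2**16, dtype=np.float32)  # placeholder shared response curve
# per-pixel corrections, only for counts inside the window (~21 MB at float32)
per_pixel = np.zeros((528, 512, bad_hi - bad_lo), dtype=np.float32)

def correct(raw):
    out = global_table[raw]                        # one shared lookup for every pixel
    mask = (raw >= bad_lo) & (raw < bad_hi)        # pixels sitting in the bad window
    yy, xx = np.nonzero(mask)
    out[mask] = per_pixel[yy, xx, raw[mask].astype(int) - bad_lo]
    return out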
> -Chris > > > On Wed, Jan 30, 2013 at 9:17 PM, Zachary Pincus wrote: > > Another thing I should note is that the nonlinearity in the camera response is fairly localized -- there's one region about 20 counts wide, and another about 500 counts wide, and the rest of the response is basically linear. So outside of those two regions, we can just linearly interpolate and be confident we're getting the right value. > > Aah, just noticed this. In that case, then the per-pixel lookup table might still work. Do one simple lookup with a single transfer function as I initially described, and then patch that output just in the error-prone regions with per-pixel lookup results, using much smaller x,y-sized lookup tables (either via fancy indexing, or scipy.ndimage.map_coordinates). > > Of course, if map_coordinates is fast enough for the whole image (and it should be), then there's no advantage to doing the above as it's a bit less general and more hacky. > > Zach > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user From cweisiger at msg.ucsf.edu Thu Jan 31 12:57:10 2013 From: cweisiger at msg.ucsf.edu (Chris Weisiger) Date: Thu, 31 Jan 2013 09:57:10 -0800 Subject: [SciPy-User] Help optimizing an algorithm In-Reply-To: <640567B9-BDEE-43A9-B80F-9B3D55F347FF@yale.edu> References: <05182E1C-E5D6-41DF-93FC-FB1BB3028CE6@yale.edu> <640567B9-BDEE-43A9-B80F-9B3D55F347FF@yale.edu> Message-ID: On Thu, Jan 31, 2013 at 9:39 AM, Zachary Pincus wrote: > I presume you've seen this article about some of the sCMOS cameras, but if > not: > > http://www.microscopy-analysis.com/files/jwiley_microscopy/2012_January_Sabharwal.pdf > > They mention the dual amplifier gain issues, and also point out some > potential trouble spots (toward the end in the "unexpected findings" > section) with the low-gain amplifier at least for the (unidentified) camera > they used. Worth knowing about... > > I hadn't seen that; shame on me for not doing my due-diligence. However, their plots look significantly worse than ours do, even if they're cherry-picking bad pixels. For comparison, here's our 7 worst (most nonlinear) pixels: http://derakon.dyndns.org/~chriswei/temp2/badPixels.png And here's our sensor-wide average low-end nonlinearity (note that camera baseline is 100 counts): http://derakon.dyndns.org/~chriswei/temp2/lowEndToe.png I don't have a plot of the least-linear low-end pixels handy. Thanks again for the help! map_coordinates is a confusing function, but I have confidence I'll sort it out sooner or later. -Chris -------------- next part -------------- An HTML attachment was scrubbed... URL: From cweisiger at msg.ucsf.edu Thu Jan 31 16:39:30 2013 From: cweisiger at msg.ucsf.edu (Chris Weisiger) Date: Thu, 31 Jan 2013 13:39:30 -0800 Subject: [SciPy-User] Help optimizing an algorithm In-Reply-To: References: <05182E1C-E5D6-41DF-93FC-FB1BB3028CE6@yale.edu> <640567B9-BDEE-43A9-B80F-9B3D55F347FF@yale.edu> Message-ID: Okay, still working on the map_coordinates stuff. I don't have uniform spacing for my sampling, but I believe it should be possible to work around that. We have a 3D array of pixel data, where the axes are (sample, y, x), i.e. the first axis is the index of the sample. 
If I can map into that array and get values out in "sample indices", I can then map sample indices to exposure times rather easily. For example, if my exposure times are (10, 20, 25), and for a given pixel, map_coordinates tells me 1.5, then that means that that pixel's exposure time is 22.5 (halfway between the exposure times with indices 1 and 2). Let's say that my sampling is just 2 4x3 images: >>> a = np.arange(24.0).reshape(2,4,3) >>> a array([[[ 0., 1., 2.], [ 3., 4., 5.], [ 6., 7., 8.], [ 9., 10., 11.]], [[ 12., 13., 14.], [ 15., 16., 17.], [ 18., 19., 20.], [ 21., 22., 23.]]]) and I want to find the proper value for the (0, 0) pixel if its reported value was 6. What I want in this case is actually .5 (i.e. halfway between the two images). This is where I'm getting stuck, unfortunately. I'm missing some conversion or something, I think. Help would be appreciated. -Chris On Thu, Jan 31, 2013 at 9:57 AM, Chris Weisiger wrote: > On Thu, Jan 31, 2013 at 9:39 AM, Zachary Pincus wrote: > >> I presume you've seen this article about some of the sCMOS cameras, but >> if not: >> >> http://www.microscopy-analysis.com/files/jwiley_microscopy/2012_January_Sabharwal.pdf >> >> They mention the dual amplifier gain issues, and also point out some >> potential trouble spots (toward the end in the "unexpected findings" >> section) with the low-gain amplifier at least for the (unidentified) camera >> they used. Worth knowing about... >> >> > I hadn't seen that; shame on me for not doing my due-diligence. However, > their plots look significantly worse than ours do, even if they're > cherry-picking bad pixels. For comparison, here's our 7 worst (most > nonlinear) pixels: > http://derakon.dyndns.org/~chriswei/temp2/badPixels.png > > And here's our sensor-wide average low-end nonlinearity (note that camera > baseline is 100 counts): > http://derakon.dyndns.org/~chriswei/temp2/lowEndToe.png > > I don't have a plot of the least-linear low-end pixels handy. > > Thanks again for the help! map_coordinates is a confusing function, but I > have confidence I'll sort it out sooner or later. > > -Chris > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bgeorge98121 at gmail.com Wed Jan 30 16:08:57 2013 From: bgeorge98121 at gmail.com (Bruno George) Date: Wed, 30 Jan 2013 13:08:57 -0800 Subject: [SciPy-User] Building Arrays from Video Data Message-ID: I'm working with video data acquired with a datalogger instead of a standard video capture card. The standard NTSC video frequency is irrational, its spec is 5MZ * 63/88, and I'm capturing 20x oversampling. I'm having a great deal of trouble making a tidy array for demodulation with the acquired data. Does anyone have any suggestions how to approach building an array that will give me a consistent starting value? I have used zoom from scipy.ndimage.interpolation, and it does job just fine, but finding the starting and end points is giving me fits. Any suggestions will be cheerfully tried. Thanks, Bruno -------------- next part -------------- An HTML attachment was scrubbed... 
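For reference, something like the following plain-numpy bracketing (a sketch that assumes the calibration stack increases monotonically along the sample axis) does give the 0.5 above; it's mapping this onto map_coordinates that I haven't figured out:

import numpy as np

a = np.arange(24.0).reshape(2, 4, 3)        # calibration stack, axes (sample, y, x)
image = 6.0 * np.ones((4, 3))               # reported counts; pixel (0, 0) should give 0.5

low = (a <= image).sum(axis=0) - 1          # per pixel: index of the sample just below
low = np.clip(low, 0, a.shape[0] - 2)       # keep low + 1 a valid sample index
yy, xx = np.indices(image.shape)
v0 = a[low, yy, xx]                         # bracketing calibration values per pixel
v1 = a[low + 1, yy, xx]
sample_index = low + (image - v0) / (v1 - v0)   # fractional sample index, 0.5 at (0, 0)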
URL: From zachary.pincus at yale.edu Thu Jan 31 19:00:21 2013 From: zachary.pincus at yale.edu (Zachary Pincus) Date: Thu, 31 Jan 2013 19:00:21 -0500 Subject: [SciPy-User] Help optimizing an algorithm In-Reply-To: References: <05182E1C-E5D6-41DF-93FC-FB1BB3028CE6@yale.edu> <640567B9-BDEE-43A9-B80F-9B3D55F347FF@yale.edu> Message-ID: <5ECF7AB6-5544-47FD-9107-A2ADFD08AB6F@yale.edu> > For example, if my exposure times are (10, 20, 25), and for a given pixel, map_coordinates tells me 1.5, then that means that that pixel's exposure time is 22.5 (halfway between the exposure times with indices 1 and 2). > > Let's say that my sampling is just 2 4x3 images: > > >>> a = np.arange(24.0).reshape(2,4,3) > >>> a > array([[[ 0., 1., 2.], > [ 3., 4., 5.], > [ 6., 7., 8.], > [ 9., 10., 11.]], > > [[ 12., 13., 14.], > [ 15., 16., 17.], > [ 18., 19., 20.], > [ 21., 22., 23.]]]) > > and I want to find the proper value for the (0, 0) pixel if its reported value was 6. What I want in this case is actually .5 (i.e. halfway between the two images). This is where I'm getting stuck, unfortunately. I'm missing some conversion or something, I think. Help would be appreciated. Let's go back a few steps to make sure we're on the same page... You have a series of flat-field images acquired at different exposure times, which together define a per-pixel gain function, right? Then for each new image you want to calculate the "effective exposure time" for the count at a given pixel. Which is to say, the light input. Is this all correct? So for each pixel, you are estimating the gain function f(exposure) -> value from your series of flat-field calibration images. Because it's monotonic, you can invert this to g(value) -> exposure. Then for any given value in an input image, you want to apply function g(). Again, is this all correct? If so then you're almost there. The problem is one that you point out initially: > I don't have uniform spacing for my sampling Without uniform spacing, you can't convert a sample value into an index in the array without doing a search through the elements in the array to figure out where your value will fit. This is why Josef was talking about searchsorted() et al. So you need to first resample your gain function to have uniform sample spacing. Let's do a one-pixel case first: exposures = [0,1,10,20,25] values = [100, 110, 200, 250, 275] input_value = 260 Now, you could just use numpy.interp() to figure out the exposure time that is the linear interpolation from this: output_exposure = numpy.interp(input_value, values, exposures) Except that under the hood this does a linear search through the values array to find the nearest neighbors of input_value, and then does the standard linear interpolation. This is going to be slow to do for every pixel in an image, unless you code it in C or cython. (Which actually wouldn't be that bad.) Instead let's resample the exposures and values to be uniform: num_samples = 10 vmin, vmax = values.min(), values.max() uniform_values = numpy.linspace(vmin, vmax, num_samples) uniform_exposures = numpy.interp(uniform_values, values, exposures) Note that we're still using numpy.interp() here: we still have to do the linear search! No free lunch. 
But we can do it just once and pre-compute the lookup table for a range of values, and then subsequently just calculate the correct index into it: value_index = (num_samples - 1) * (input_value - vmin) / float(vmax - vmin) Now we can do linear interpolation with map_coordinates(): exposure_estimate = scipy.ndimage.map_coordinates(uniform_exposures, [[value_index]], order=1)[0] # extra packing/unpacking just for scalar case. Or just directly do the linear interpolation directly: fraction, index = numpy.modf(value_index) index = int(index) l, h = uniform_exposures[[index, index + 1]] exposure_estimate = h*fraction + l*(1 - fraction) So you still need to loop through pixel by pixel and do numpy.interp(), but just once to get a uniformly spaced input array. Then you can use that for map_coordinates() as I described earlier. Remember that in the 2D case, you need to not only provide the appropriate value_index, but also the x- and y-indices, again as I described in the previous email. If you are still stuck, I'll write out example code equivalent to the above but for the 2d case. This all clear? I'm happy to explain anything in further detail! Zach From jmjatkins at gmail.com Thu Jan 31 22:39:55 2013 From: jmjatkins at gmail.com (John) Date: Fri, 1 Feb 2013 03:39:55 +0000 (UTC) Subject: [SciPy-User] why is my scipy slow? Message-ID: Hello, I've been using scipy for a few weeks now and for the most part am thoughily enjoying it! However I have been porting code from matlab and have been surprissed by how much slower it is runnning under python. So much so that I suspect I must be doing something wrong. Below is an example. In matlab the doSomething() function takes 6.4ms. In python it taks 78ms, more than 10x slower. Does this seem right? Or am I missing something? I installed the Enthough distribution for Windows. Any advise much appreaciated! First in python: import time import scipy.signal def speedTest(): rep = 1000 tt = time.time() for i in range(rep): doSomething() print (time.time() - tt) / rep def doSomething(): lp = scipy.signal.firwin(16, 0.5); data = scipy.rand(100000) data = scipy.signal.convolve(data, lp) if __name__ == '__main__': speedTest() Now in matlab: function matlabSpeedTest() rep = 1000; tStart=tic; for j=1:rep doSomething(); end tElapsed=toc(tStart)/rep; str = sprintf('time %s', tElapsed); disp(str); end function data = doSomething() lp = fir1(16,0.5); data = rand(100000, 1, 'double'); data = conv(lp, data); end From tdimiduk at physics.harvard.edu Thu Jan 31 22:46:58 2013 From: tdimiduk at physics.harvard.edu (Tom Dimiduk) Date: Thu, 31 Jan 2013 22:46:58 -0500 Subject: [SciPy-User] [ANN] HoloPy 2.0 Message-ID: <510B3AB2.9070301@physics.harvard.edu> I am pleased to announce the release of HoloPy 2.0: http://manoharan.seas.harvard.edu/holopy/ https://launchpad.net/holopy HoloPy is my research group's tool for working with digital holograms and computational light scattering. We are attempting to do for the classic light scattering codes what numpy did for BLAS and LAPACK: take a powerful well tested tool and provide a flexible high level interface that makes it easy to do cool new things. HoloPy provides easy tools to: * Load and visualize images, and associate them with experimental metadata * Reconstruct 3D volumes from digital holograms * Do Scattering Calculations: - Compute Holograms, electric fields, scattered intensity, cross sections, ... 
- From spheres, clusters of spheres, and arbitrary structures (using DDA) * Make precise measurements by fitting scattering models (based on the above structures) to experimental data. If anyone on this list works with Holograms or Light Scattering, I encourage you to check it out, tell us what you find useful, or things you would like it to do. Potentially of interest to others, we have what I think are some interesting takes on setting up minimization problems and using numpy arrays with physical imaging type data. Tom Dimiduk Manoharan Lab Harvard University From josef.pktd at gmail.com Thu Jan 31 22:49:59 2013 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Thu, 31 Jan 2013 22:49:59 -0500 Subject: [SciPy-User] why is my scipy slow? In-Reply-To: References: Message-ID: On Thu, Jan 31, 2013 at 10:39 PM, John wrote: > Hello, > > I've been using scipy for a few weeks now and for the most part am thoughily > enjoying it! However I have been porting code from matlab and have been > surprissed by how much slower it is runnning under python. So much so that I > suspect I must be doing something wrong. Below is an example. In matlab the > doSomething() function takes 6.4ms. In python it taks 78ms, more than 10x > slower. Does this seem right? Or am I missing something? I installed the > Enthough distribution for Windows. Any advise much appreaciated! > > First in python: > > import time > import scipy.signal > > def speedTest(): > rep = 1000 > tt = time.time() > for i in range(rep): > doSomething() > print (time.time() - tt) / rep > > def doSomething(): > lp = scipy.signal.firwin(16, 0.5); > data = scipy.rand(100000) > data = scipy.signal.convolve(data, lp) > > if __name__ == '__main__': > speedTest() > > > > Now in matlab: > > function matlabSpeedTest() > rep = 1000; > tStart=tic; > for j=1:rep > doSomething(); > end > tElapsed=toc(tStart)/rep; > str = sprintf('time %s', tElapsed); > disp(str); > end > > function data = doSomething() > lp = fir1(16,0.5); > data = rand(100000, 1, 'double'); > data = conv(lp, data); > end maybe you want fftconvolve, faster for long arrays Josef > > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user From warren.weckesser at gmail.com Thu Jan 31 22:55:01 2013 From: warren.weckesser at gmail.com (Warren Weckesser) Date: Thu, 31 Jan 2013 22:55:01 -0500 Subject: [SciPy-User] why is my scipy slow? In-Reply-To: References: Message-ID: On Thu, Jan 31, 2013 at 10:39 PM, John wrote: > Hello, > > I've been using scipy for a few weeks now and for the most part am > thoughily > enjoying it! However I have been porting code from matlab and have been > surprissed by how much slower it is runnning under python. So much so that > I > suspect I must be doing something wrong. Below is an example. In matlab the > doSomething() function takes 6.4ms. In python it taks 78ms, more than 10x > slower. Does this seem right? Or am I missing something? I installed the > Enthough distribution for Windows. Any advise much appreaciated! 
> > First in python: > > import time > import scipy.signal > > def speedTest(): > rep = 1000 > tt = time.time() > for i in range(rep): > doSomething() > print (time.time() - tt) / rep > > def doSomething(): > lp = scipy.signal.firwin(16, 0.5); > data = scipy.rand(100000) > data = scipy.signal.convolve(data, lp) > > if __name__ == '__main__': > speedTest() > > > > Now in matlab: > > function matlabSpeedTest() > rep = 1000; > tStart=tic; > for j=1:rep > doSomething(); > end > tElapsed=toc(tStart)/rep; > str = sprintf('time %s', tElapsed); > disp(str); > end > > function data = doSomething() > lp = fir1(16,0.5); > data = rand(100000, 1, 'double'); > data = conv(lp, data); > end > > > There are several methods you can use to apply a FIR filter to a signal; scipy.signal.convolve is actually one of the slowest. See http://www.scipy.org/Cookbook/ApplyFIRFilter for a comparison of the methods. Warren > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jake.biesinger at gmail.com Thu Jan 31 23:55:32 2013 From: jake.biesinger at gmail.com (Jacob Biesinger) Date: Thu, 31 Jan 2013 20:55:32 -0800 Subject: [SciPy-User] norm.logpdf fails with float128? Message-ID: Hi! I know float128 support is a bit sketchy. I've noticed the norm.pdf and norm.logpdf functions both choke when given float128 values. The weird thing is the corresponding norm._pdf and norm._logpdf functions seem to work fine: # pdf function fails >>> m = sp.ones(5, dtype=sp.longdouble) >>> norm.pdf(m) ... TypeError: array cannot be safely cast to required type >>> norm.pdf(m, dtype=sp.longdouble) ... TypeError: array cannot be safely cast to required type # but the _pdf function works fine, with appropriate long-double-precision >>> norm._pdf(m*100) array([ 1.3443135e-2172, 1.3443135e-2172, 1.3443135e-2172, 1.3443135e-2172, 1.3443135e-2172], dtype=float128) Is this expected behaviour? Perhaps a problem with the `place` command? If it's not expected, I'm happy to tinker and submit a pull request. -- Jake Biesinger Graduate Student Xie Lab, UC Irvine -------------- next part -------------- An HTML attachment was scrubbed... URL: