Mailman 3 November 2011 - NumPy-Discussion

Failure to build numpy 1.6.1
by Mads Ipsen Nov. 8, 2011

Hi,

I am trying to build numpy-1.6.1 with the following gcc compiler specs:

Reading specs from /usr/lib/gcc/x86_64-redhat-linux/3.4.6/specs
Configured with: ../configure --prefix=/usr --mandir=/usr/share/man --infodir=/usr/share/info --enable-shared --enable-threads=posix --disable-checking --with-system-zlib --enable-__cxa_atexit --disable-libunwind-exceptions --enable-java-awt=gtk --host=x86_64-redhat-linux
Thread model: posix
gcc version 3.4.6 20060404 (Red Hat 3.4.6-11)

I get the following error (any clues as to what goes wrong?):

creating build/temp.linux-x86_64-2.7/numpy/core/src/multiarray
compile options: '-Inumpy/core/include -Ibuild/src.linux-x86_64-2.7/numpy/core/include/numpy -Inumpy/core/src/private -Inumpy/core/src -Inumpy/core -Inumpy/core/src/npymath -Inumpy/core/src/multiarray -Inumpy/core/src/umath -Inumpy/core/include -I/home/quantum/quantumnotes/qw-control/quantumsource/external-libs/build/include/python2.7 -Ibuild/src.linux-x86_64-2.7/numpy/core/src/multiarray -Ibuild/src.linux-x86_64-2.7/numpy/core/src/umath -c'
gcc: numpy/core/src/multiarray/multiarraymodule_onefile.c
numpy/core/src/multiarray/descriptor.c: In function `_convert_divisor_to_multiple':
numpy/core/src/multiarray/descriptor.c:606: warning: 'q' might be used uninitialized in this function
numpy/core/src/multiarray/einsum.c.src: In function `float_sum_of_products_contig_outstride0_one':
numpy/core/src/multiarray/einsum.c.src:852: error: unrecognizable insn:
(insn:HI 440 228 481 14 /usr/lib/gcc/x86_64-redhat-linux/3.4.6/include/xmmintrin.h:915 (set (reg:SF 148) (vec_select:SF (reg/v:V4SF 67 [ accum_sse ]) (parallel [ (const_int 0 [0x0]) ]))) -1 (insn_list 213 (nil)) (nil))
numpy/core/src/multiarray/einsum.c.src:852: internal compiler error: in extract_insn, at recog.c:2083
Please submit a full bug report, with preprocessed source if appropriate.
See <URL:http://bugzilla.redhat.com/bugzilla> for instructions.
Preprocessed source stored into /tmp/ccXaPpf8.out file, please attach this to your bugreport.
numpy/core/src/multiarray/descriptor.c: In function `_convert_divisor_to_multiple':
numpy/core/src/multiarray/descriptor.c:606: warning: 'q' might be used uninitialized in this function
numpy/core/src/multiarray/einsum.c.src: In function `float_sum_of_products_contig_outstride0_one':
numpy/core/src/multiarray/einsum.c.src:852: error: unrecognizable insn:
(insn:HI 440 228 481 14 /usr/lib/gcc/x86_64-redhat-linux/3.4.6/include/xmmintrin.h:915 (set (reg:SF 148) (vec_select:SF (reg/v:V4SF 67 [ accum_sse ]) (parallel [ (const_int 0 [0x0]) ]))) -1 (insn_list 213 (nil)) (nil))
numpy/core/src/multiarray/einsum.c.src:852: internal compiler error: in extract_insn, at recog.c:2083
Please submit a full bug report, with preprocessed source if appropriate.
See <URL:http://bugzilla.redhat.com/bugzilla> for instructions.
Preprocessed source stored into /tmp/ccXaPpf8.out file, please attach this to your bugreport.
error: Command "gcc -fno-strict-aliasing -g -O2 -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -Inumpy/core/include -Ibuild/src.linux-x86_64-2.7/numpy/core/include/numpy -Inumpy/core/src/private -Inumpy/core/src -Inumpy/core -Inumpy/core/src/npymath -Inumpy/core/src/multiarray -Inumpy/core/src/umath -Inumpy/core/include -I/home/quantum/quantumnotes/qw-control/quantumsource/external-libs/build/include/python2.7 -Ibuild/src.linux-x86_64-2.7/numpy/core/src/multiarray -Ibuild/src.linux-x86_64-2.7/numpy/core/src/umath -c numpy/core/src/multiarray/multiarraymodule_onefile.c -o build/temp.linux-x86_64-2.7/numpy/core/src/multiarray/multiarraymodule_onefile.o" failed with exit status 1
make: *** [/home/quantum/quantumnotes/qw-control/quantumsource/external-libs/src/numpy-1.6.1/make-stamp] Error 1

--
+-----------------------------------------------------+
| Mads Ipsen                                          |
+----------------------+------------------------------+
| Gåsebæksvej 7, 4. tv |                              |
| DK-2500 Valby        | phone: +45-29716388          |
| Denmark              | email: mads.ipsen(a)gmail.com|
+----------------------+------------------------------+

in the NA discussion, what can we agree on?
by Nathaniel Smith Nov. 7, 2011

Hi again,

Okay, here's my attempt at an *uncontroversial* email!

Specifically, I think it'll be easier to talk about this NA stuff if we can establish some common ground, and easier for people to follow if the basic points of agreement are laid out in one place. So I'm going to try and summarize just the things that we can agree about.

Note that right now I'm *only* talking about what kind of tools we want to give the user -- i.e., what kind of problems we are trying to solve. AFAICT we don't have as much consensus on implementation matters, and anyway it's hard to make implementation decisions without knowing what we're trying to accomplish.

1) I think we have consensus that there are (at least) two different possible ways of thinking about this problem, with somewhat different constituencies. Let's call these two concepts "MISSING data" and "IGNORED data".

2) I also think we have at least a rough consensus on what these concepts mean, and what their supporters want from them:

MISSING data:
- Conceptually, MISSINGness acts like a property of a datum -- assigning MISSING to a location is like assigning any other value to that location
- Ufuncs and other operations must propagate these values by default, and there must be an option to cause them to be ignored
- Must be competitive with NaNs in terms of speed and memory usage (or else people will just use NaNs)
- Compatibility with R is valuable
- To avoid user confusion, ideally it should *not* be possible to 'unmask' a missing value, since this is inconsistent with the "missing value" metaphor (e.g., see Wes's comment about "leaky abstractions")
- Possible useful extension: having different classes of missing values (similar to Stata)
- Target audience: data analysis with missing data, neuroimaging, econometrics, former R users, ...

IGNORED data:
- Conceptually, IGNOREDness acts like a property of the array -- toggling a location to be IGNORED is kind of vaguely similar to changing an array's shape
- Ufuncs and other operations must ignore these values by default, and there doesn't really need to be a way to propagate them, even as an option (though it probably wouldn't hurt either)
- Some memory overhead is inevitable and acceptable
- Compatibility with R neither possible nor valuable
- Ability to toggle the IGNORED state of a location is critical, and should be as convenient as possible
- Possible useful extension: having not just different types of ignored values, but richer ways to combine them -- e.g., the example of combining astronomical images with some kind of associated per-pixel quality scores, where one might want the 'mask' to be not just a boolean IGNORED/not-IGNORED flag, but an integer (perhaps a multi-byte integer) or even a float, and to allow these 'masks' to be combined in some more complex way than just logical_and.
- Target audience: anyone who's already doing this kind of thing by hand using a second mask array + boolean indexing, former numpy.ma users, matplotlib, ...

3) And perhaps we can all agree that the biggest *un*resolved question is whether we want to:
- emphasize the similarities between these two use cases and build a single interface that can handle both concepts, with some compromises
- or, treat these as two mostly-separate features that can each become exactly what the respective constituency wants without compromise -- but with some potential redundancy and extra code.
Each approach has advantages and disadvantages.

Does that seem like a fair summary? Anything more we can add? Most importantly, anything here that you disagree with? Did I summarize your needs well? Do you have a use case that you feel doesn't fit naturally into either category?

[Also, I thought this might make the start of a good wiki page for people to reference during these discussions, but I don't seem to have edit rights. If other people agree, maybe someone could put it up, or give me access? My trac id is njs(a)pobox.com.]

Thanks,
-- Nathaniel
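
To make the two concepts above a little more concrete, here is a rough sketch using tools NumPy already has. It is only an analogy and an assumption on my part: NaN stands in for MISSING and a numpy.ma mask stands in for IGNORED, and neither is the interface under discussion.

```python
import math
import numpy as np

# MISSING-style: NaN is stored in the array like any other value.
values = np.array([1.0, np.nan, 3.0])
propagated = values.sum()      # the missing value propagates by default
skipped = np.nansum(values)    # ... with an explicit option to ignore it

# IGNORED-style: a separate mask hides a location; the stored value survives.
masked = np.ma.masked_array([1.0, 2.0, 3.0], mask=[False, True, False])
ignored_total = masked.sum()   # masked entries are skipped by default

# Toggling the IGNORED state back off recovers the original datum.
masked.mask = [False, False, False]
restored_total = masked.sum()

print(math.isnan(propagated), skipped, ignored_total, restored_total)
```

The NaN case cannot be "unmasked" (the original value is gone), while the masked-array case keeps the value underneath and costs extra memory for the mask, which matches the trade-offs listed for each concept.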

Int casting different across platforms
by Matthew Brett Nov. 6, 2011

Hi,

I noticed this:

(Intel Mac):

In [2]: np.int32(np.float32(2**31))
Out[2]: -2147483648

(PPC):

In [3]: np.int32(np.float32(2**31))
Out[3]: 2147483647

I assume what is happening is that the casting is handed off to the C library, and that the behavior of the C library differs on these platforms? Should we expect or hope that this behavior would be the same across platforms?

Thanks for any pointers,

Matthew
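
For context, 2**31 is one past the largest int32 value, and in C the result of converting an out-of-range float to an integer is undefined, so a platform difference is plausible. One way to get the same answer everywhere is to clamp before casting. The helper below is a minimal sketch; `saturating_int32` is a hypothetical name, not a NumPy function, and it does not handle NaN.

```python
import numpy as np

def saturating_int32(value):
    """Cast a float to int32, clamping out-of-range inputs (hypothetical helper)."""
    info = np.iinfo(np.int32)
    # Clip in float64: float32 cannot represent 2**31 - 1 exactly,
    # so clipping in float32 would round the upper bound back up to 2**31.
    clipped = np.clip(np.float64(value), info.min, info.max)
    return np.int32(clipped)

result = saturating_int32(np.float32(2**31))
print(result)
```

Clamping picks the saturating behaviour seen on PPC above; whether that is the right choice depends on the application.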

ANN: scipy 0.10 release candidate 1
by Ralf Gommers Nov. 5, 2011

Hi all,

I am pleased to announce the availability of the first release candidate of SciPy 0.10.0. For this release over 100 tickets and pull requests have been closed, and many new features have been added. Some of the highlights are:

  - support for Bento as a build system for scipy
  - generalized and shift-invert eigenvalue problems in sparse.linalg
  - addition of discrete-time linear systems in the signal module

Sources and binaries can be found at http://sourceforge.net/projects/scipy/files/scipy/0.10.0rc1/, release notes are copied below.

Please try this release and report problems on the mailing list. Note: one problem with Python 2.5 (syntax) was discovered after tagging the release, it's fixed in the 0.10.x branch already so no need to report that one.

Cheers,
Ralf


==========================
SciPy 0.10.0 Release Notes
==========================

.. note:: Scipy 0.10.0 is not released yet!

.. contents::

SciPy 0.10.0 is the culmination of 8 months of hard work. It contains many new features, numerous bug-fixes, improved test coverage and better documentation. There have been a limited number of deprecations and backwards-incompatible changes in this release, which are documented below. All users are encouraged to upgrade to this release, as there are a large number of bug-fixes and optimizations. Moreover, our development attention will now shift to bug-fix releases on the 0.10.x branch, and on adding new features on the development master branch.

Release highlights:

  - Support for Bento as optional build system.
  - Support for generalized eigenvalue problems, and all shift-invert modes available in ARPACK.

This release requires Python 2.4-2.7 or 3.1- and NumPy 1.5 or greater.


New features
============

Bento: new optional build system
--------------------------------

Scipy can now be built with `Bento <http://cournape.github.com/Bento/>`_. Bento has some nice features like parallel builds and partial rebuilds, that are not possible with the default build system (distutils). For usage instructions see BENTO_BUILD.txt in the scipy top-level directory.

Currently Scipy has three build systems, distutils, numscons and bento. Numscons is deprecated and will likely be removed in the next release.

Generalized and shift-invert eigenvalue problems in ``scipy.sparse.linalg``
---------------------------------------------------------------------------

The sparse eigenvalue problem solver functions ``scipy.sparse.linalg.eigs/eigsh`` now support generalized eigenvalue problems, and all shift-invert modes available in ARPACK.

Discrete-Time Linear Systems (``scipy.signal``)
-----------------------------------------------

Support for simulating discrete-time linear systems, including ``scipy.signal.dlsim``, ``scipy.signal.dimpulse``, and ``scipy.signal.dstep``, has been added to SciPy. Conversion of linear systems from continuous-time to discrete-time representations is also present via the ``scipy.signal.cont2discrete`` function.

Enhancements to ``scipy.signal``
--------------------------------

A Lomb-Scargle periodogram can now be computed with the new function ``scipy.signal.lombscargle``.

The forward-backward filter function ``scipy.signal.filtfilt`` can now filter the data in a given axis of an n-dimensional numpy array. (Previously it only handled a 1-dimensional array.) Options have been added to allow more control over how the data is extended before filtering.

FIR filter design with ``scipy.signal.firwin2`` now has options to create filters of type III (zero at zero and Nyquist frequencies) and IV (zero at zero frequency).

Additional decomposition options (``scipy.linalg``)
---------------------------------------------------

A sort keyword has been added to the Schur decomposition routine (``scipy.linalg.schur``) to allow the sorting of eigenvalues in the resultant Schur form.

Additional special matrices (``scipy.linalg``)
----------------------------------------------

The functions ``hilbert`` and ``invhilbert`` were added to ``scipy.linalg``.

Enhancements to ``scipy.stats``
-------------------------------

* The *one-sided form* of Fisher's exact test is now also implemented in ``stats.fisher_exact``.
* The function ``stats.chi2_contingency`` for computing the chi-square test of independence of factors in a contingency table has been added, along with the related utility functions ``stats.contingency.margins`` and ``stats.contingency.expected_freq``.

Basic support for Harwell-Boeing file format for sparse matrices
----------------------------------------------------------------

Both read and write are supported through a simple function-based API, as well as a more complete API to control number format. The functions may be found in scipy.sparse.io.

The following features are supported:

* Read and write sparse matrices in the CSC format
* Only real, unsymmetric, assembled matrices are supported (RUA format)


Deprecated features
===================

``scipy.maxentropy``
--------------------

The maxentropy module is unmaintained, rarely used and has not been functioning well for several releases. Therefore it has been deprecated for this release, and will be removed for scipy 0.11. Logistic regression in scikits.learn is a good alternative for this functionality. The ``scipy.maxentropy.logsumexp`` function has been moved to ``scipy.misc``.

``scipy.lib.blas``
------------------

There are similar BLAS wrappers in ``scipy.linalg`` and ``scipy.lib``. These have now been consolidated as ``scipy.linalg.blas``, and ``scipy.lib.blas`` is deprecated.

Numscons build system
---------------------

The numscons build system is being replaced by Bento, and will be removed in one of the next scipy releases.


Backwards-incompatible changes
==============================

The deprecated name `invnorm` was removed from ``scipy.stats.distributions``, this distribution is available as `invgauss`.

The following deprecated nonlinear solvers from ``scipy.optimize`` have been removed::

    - ``broyden_modified`` (bad performance)
    - ``broyden1_modified`` (bad performance)
    - ``broyden_generalized`` (equivalent to ``anderson``)
    - ``anderson2`` (equivalent to ``anderson``)
    - ``broyden3`` (obsoleted by new limited-memory broyden methods)
    - ``vackar`` (renamed to ``diagbroyden``)


Other changes
=============

``scipy.constants`` has been updated with the CODATA 2010 constants.

``__all__`` lists have been added to all modules, which has cleaned up the namespaces (particularly useful for interactive work).

An API section has been added to the documentation, giving recommended import guidelines and specifying which submodules are public and which aren't.


Authors
=======

This release contains work by the following people (contributed at least one patch to this release, names in alphabetical order):

* Jeff Armstrong +
* Matthew Brett
* Lars Buitinck +
* David Cournapeau
* FI$H 2000 +
* Michael McNeil Forbes +
* Matty G +
* Christoph Gohlke
* Ralf Gommers
* Yaroslav Halchenko
* Charles Harris
* Thouis (Ray) Jones +
* Chris Jordan-Squire +
* Robert Kern
* Chris Lasher +
* Wes McKinney +
* Travis Oliphant
* Fabian Pedregosa
* Josef Perktold
* Thomas Robitaille +
* Pim Schellart +
* Anthony Scopatz +
* Skipper Seabold +
* Fazlul Shahriar +
* David Simcha +
* Scott Sinclair +
* Andrey Smirnov +
* Collin RM Stocks +
* Martin Teichmann +
* Jake Vanderplas +
* Gaël Varoquaux +
* Pauli Virtanen
* Stefan van der Walt
* Warren Weckesser
* Mark Wiebe +

A total of 35 people contributed to this release. People with a "+" by their names contributed a patch for the first time.
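
As a quick way to try one of the new features listed in the release notes above, the sketch below exercises the ``hilbert`` and ``invhilbert`` functions from ``scipy.linalg``. It assumes a SciPy version that includes them (0.10.0 or later); the matrix size and the variable names are arbitrary choices for illustration.

```python
import numpy as np
from scipy.linalg import hilbert, invhilbert

n = 4
h = hilbert(n)          # h[i, j] = 1 / (i + j + 1)
h_inv = invhilbert(n)   # inverse of the Hilbert matrix, computed from a formula

# For small n the product should be the identity to within rounding error.
product_is_identity = np.allclose(np.dot(h, h_inv), np.eye(n))
print(product_is_identity)
```

Hilbert matrices become badly conditioned as n grows, so this check is only meaningful for small sizes.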

what is the point of dx for np.gradient()?
by Benjamin Root Nov. 5, 2011

For np.gradient(), one can specify a sample distance for each axis to apply to the gradient. But all this does is divide the gradient by the sample distance. I could easily do that myself with the output from gradient. Wouldn't it be more valuable to be able to specify the width of the central difference (or is there another function that does that)?

Thanks,
Ben Root
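
The observation about dx can be checked directly, and a wider central difference is short to write by hand. This is only a sketch under my own assumptions: `wide_central_difference` is a hypothetical helper, not a NumPy function, and it returns interior points only.

```python
import numpy as np

dx = 0.5
x = np.arange(0.0, 5.0, dx)
y = x ** 2

# Passing dx is equivalent to dividing the unit-spacing gradient by dx.
scaled_by_gradient = np.gradient(y, dx)
scaled_by_hand = np.gradient(y) / dx

def wide_central_difference(values, spacing, half_width):
    """(f[i+k] - f[i-k]) / (2*k*spacing) for interior points (hypothetical helper)."""
    k = half_width
    return (values[2 * k:] - values[:-2 * k]) / (2 * k * spacing)

# Element i of the result is the estimate at x[i + half_width].
wide = wide_central_difference(y, dx, half_width=2)

print(np.allclose(scaled_by_gradient, scaled_by_hand))
print(wide)
```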

Re: [Numpy-discussion] Indexing a masked array with another masked array leads to unexpected results
by Joe Kington Nov. 4, 2011

On Fri, Nov 4, 2011 at 5:26 AM, Pierre GM <pgmdevlist(a)gmail.com> wrote:

>
> On Nov 03, 2011, at 23:07 , Joe Kington wrote:
>
> > I'm not sure if this is exactly a bug, per se, but it's a very confusing
> > consequence of the current design of masked arrays…
>
> I would just add a "I think" between the "but" and "it's" before I could
> agree.
>
> > Consider the following example:
> >
> > import numpy as np
> >
> > x = np.ma.masked_all(10, dtype=np.float32)
> > print(x)
> > x[x > 0] = 5
> > print(x)
> >
> > The exact results will vary depending on the contents of the empty memory
> > the array was initialized from.
>
> Not a surprise. But isn't it mentioned in the doc somewhere that using a
> masked array as index is a very bad idea ? And that you should always fill
> it before you use it as an array ? (Actually, using a MaskedArray as index
> used to raise an IndexError. But I thought it was a bit too harsh, so I
> dropped it).
>

Not that I can find in the docs (Perhaps I just missed it?). At any rate, it's not mentioned in the numpy.ma section on indexing: http://docs.scipy.org/doc/numpy/reference/maskedarray.generic.html#indexing…

The only mention of it is a comment in MaskedArray.__setitem__ where the IndexError is commented out.

> ma.masked_all is an empty array with all its elements masked. I.e., you have
> an uninitialized ndarray as data, and a bool array of the same size, full
> of True. The operative word here is "uninitialized".
>
> > This wreaks havoc when filtering the contents of masked arrays (and
> > leads to hard-to-find bugs!). The mask of the array in question is altered
> > at random (or, rather, based on the masked values as well as the unmasked
> > ones).
>
> Once again, you're working on an *uninitialized* array. What you should
> really do is to initialize it first, e.g. by 0, or whatever would make
> sense in your field, and then work from that.
>

Sure, I shouldn't have used that as the example. My point was that it's counter-intuitive that something like "x[x > 0] = 0" alters the mask of x based on the values of _masked_ elements. How it's initialized is irrelevant (though, of course, it wouldn't be semi-random if it were initialized in another way).

> > I can see the reasoning behind the way it works. It makes sense that
> > "x > 0" returns a masked boolean array with potentially several elements
> > masked, as well as the unmasked elements greater than 0.
>
> Well, "x > 0" is also a masked array, with its mask full of True. Not very
> usable by itself, and especially *not* for indexing.
>
> > However, wouldn't it make more sense to have MaskedArray.__setitem__
> > only operate on the unmasked elements of the "indx" passed in (at least in
> > the case where the assigned "value" isn't a masked array)?
>
> Normally, that should be the case. But you're not working in "normal"
> conditions, here. A bit like trying to boil water on a stove with a plastic
> pan.
>

"x[x > threshold] = something" is a very common idiom for ndarrays. I think most people would find it surprising that this operation doesn't ignore the masked values.

I noticed this because one of my coworkers was complaining that a piece of my code was "messing up" their masked arrays. I'd never tested it with masked arrays, but it took me ages to find, just because I wasn't looking in places where I was just using common idioms. In this particular case, they'd initialized it with "masked_all", so it effectively altered the mask of the array at random. Regardless of how it was initialized, though, it is surprising that the mask of "x" is changed based on masked values.

I just think it would be useful for it to be documented.

Cheers,
-Joe

Indexing a masked array with another masked array leads to unexpected results
by Joe Kington Nov. 4, 2011

Forgive me if this is already a well-known oddity of masked arrays. I hadn't seen it before, though.

I'm not sure if this is exactly a bug, per se, but it's a very confusing consequence of the current design of masked arrays...

Consider the following example:

import numpy as np

x = np.ma.masked_all(10, dtype=np.float32)
print(x)
x[x > 0] = 5
print(x)

The exact results will vary depending on the contents of the empty memory the array was initialized from.

This wreaks havoc when filtering the contents of masked arrays (and leads to hard-to-find bugs!). The mask of the array in question is altered at random (or, rather, based on the masked values as well as the unmasked ones).

Of course, once you're aware of this, there are a number of workarounds (namely, filling the array or explicitly operating on "x.data" instead of x).

I can see the reasoning behind the way it works. It makes sense that "x > 0" returns a masked boolean array with potentially several elements masked, as well as the unmasked elements greater than 0.

However, wouldn't it make more sense to have MaskedArray.__setitem__ only operate on the unmasked elements of the "indx" passed in (at least in the case where the assigned "value" isn't a masked array)?

Cheers,
-Joe
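
The "filling" workaround mentioned above might look like the sketch below. It uses an initialized array rather than masked_all so the result is deterministic; the sample values are made up for illustration.

```python
import numpy as np

x = np.ma.array([1.0, -2.0, 3.0, 4.0], mask=[False, False, True, False])

# Fill the masked entries of the comparison with False, so the index is a
# plain boolean ndarray that never selects a masked location.
condition = (x > 0).filled(False)
x[condition] = 5

print(x)
print(x.mask)
```

With a plain boolean index, only the unmasked elements greater than 0 are assigned, and the mask of x is left as it was.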

numpy with nose
by akshar bhosale Nov. 3, 2011

Hi,

I am using MKL 10.1, Intel Cluster Toolkit 11/069, RHEL 5.2 x86_64 and Python 2.6; the processor is an Intel Xeon. The numpy version is 1.6.0.

My numpy.test run hangs at the following point:

Test whether equivalent subarray dtypes hash the same. ... ok
Test whether different subarray dtypes hash differently. ... ok
Test some data types that are equal ... ok
Test some more complicated cases that shouldn't be equal ... ok
Test some simple cases that shouldn't be equal ... ok
test_single_subarray (test_dtype.TestSubarray) ... ok
test_einsum_errors (test_einsum.TestEinSum) ... ok
test_einsum_sums_cfloat128 (test_einsum.TestEinSum) ...

Any pointers for this?

numpy error with mkl 10.1
by akshar bhosale Nov. 3, 2011

Hi,

I am getting the following error:

python -c 'import numpy;numpy.matrix([[1, 5, 10], [1.0, 3j, 4]], numpy.complex128).T.I.H'
MKL FATAL ERROR: Cannot load libmkl_lapack.so

I have installed numpy 1.6.0 with Python 2.6. The Intel Cluster Toolkit is installed on my system (version 11/069, with MKL 10.1). The machine has an Intel Xeon processor and runs RHEL 5.2 on x86_64.

Kindly help.
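
For anyone trying to reproduce this, the one-liner above amounts to a pseudo-inverse, since `.I` on a non-square matrix falls back to one. The sketch below spells the same computation out with plain arrays, so it can serve as a small check of whether the LAPACK-backed routines load at all; the variable names are my own.

```python
import numpy as np

a = np.array([[1, 5, 10], [1.0, 3j, 4]], dtype=np.complex128)

tall = a.T                              # 3x2, like .T in the one-liner
pseudo_inverse = np.linalg.pinv(tall)   # what .I computes for a non-square matrix
result = pseudo_inverse.conj().T        # conjugate transpose, like .H

# With full column rank, pinv(A) times A should be the 2x2 identity.
check = np.allclose(np.dot(pseudo_inverse, tall), np.eye(2))
print(check)
```

If this also fails with the same MKL message, the problem lies in how the MKL libraries are located at run time rather than in the matrix class.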

ANN: Spyder v2.1
by Pierre Raybaut Nov. 3, 2011

Hi all,

On behalf of Spyder's development team (http://code.google.com/p/spyderlib/people/list), I'm pleased to announce that Spyder v2.1 has been released and is available for Windows XP/Vista/7, GNU/Linux and MacOS X: http://code.google.com/p/spyderlib/

Spyder is a free, open-source (MIT license) interactive development environment for the Python language with advanced editing, interactive testing, debugging and introspection features. Originally designed to provide MATLAB-like features (integrated help, interactive console, variable explorer with GUI-based editors for dictionaries, NumPy arrays, ...), it is strongly oriented towards scientific computing and software development. Thanks to the `spyderlib` library, Spyder also provides powerful ready-to-use widgets: embedded Python console (example: http://packages.python.org/guiqwt/_images/sift3.png), NumPy array editor (example: http://packages.python.org/guiqwt/_images/sift2.png), dictionary editor, source code editor, etc.

Description of key features with tasty screenshots can be found at: http://code.google.com/p/spyderlib/wiki/Features

This release represents a year of development since v2.0 and introduces major enhancements and new features:
* Large performance and stability improvements
* PySide support (PyQt is no longer exclusively required)
* New profiler plugin (thanks to Santiago Jaramillo, a new contributor)
* Experimental support for IPython v0.11+
* And many other changes: http://code.google.com/p/spyderlib/wiki/ChangeLog

On Windows platforms, Spyder is also available as a stand-alone executable (don't forget to disable UAC on Vista/7). This all-in-one portable version is still experimental (for example, it does not embed sphinx -- meaning no rich text mode for the object inspector) but it should provide a working version of Spyder for Windows platforms without having to install anything else (except Python 2.x itself, of course).

Don't forget to follow Spyder updates/news:
* on the project website: http://code.google.com/p/spyderlib/
* and on our official blog: http://spyder-ide.blogspot.com/

Last, but not least, we welcome any contribution that helps make Spyder an efficient scientific development/computing environment. Join us to help create your favourite environment! (http://code.google.com/p/spyderlib/wiki/NoteForContributors)

Enjoy!
-Pierre
