is there a C-API function for numpy which implements Python's
multidimensional indexing? Say, I have a 2d-array
PyArrayObject * M;
and an index i.
How do I extract the i-th row M[i,:] or the i-th column M[:,i]?
I am looking for a function which again gives a PyArrayObject * and
which is a view into M (no copied data; the result should be another
PyArrayObject whose data and strides point to the correct memory
portion of M).
I searched the API documentation, Google and mailing lists for quite a
long time but didn't find anything. Can you help me?
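For reference, Python-level basic slicing already has exactly these view semantics (the result's data pointer and strides point into M's buffer), and from C the usual route is to build the index tuple and call PyObject_GetItem on the array, which is what Python's slicing syntax does. A minimal sketch at the Python level, just to illustrate the view behaviour being asked for:

```python
import numpy as np

# M owns its data (the .copy() guarantees that), so views report M as base.
M = np.arange(12.0).reshape(3, 4).copy()

row = M[1, :]   # view of the i-th row (contiguous)
col = M[:, 2]   # view of the i-th column (strided, still no copy)

# Both views share M's buffer: writing through them modifies M.
row[0] = 99.0
col[0] = -1.0

assert row.base is M and col.base is M          # views, not copies
assert np.shares_memory(row, M) and np.shares_memory(col, M)
assert M[1, 0] == 99.0 and M[0, 2] == -1.0
```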
I'm trying to do something that at first glance I think should be simple
but I can't quite figure out how to do it. The problem is as follows:
I have a 3D grid Values[Nx, Ny, Nz]
I want to slice Values at a 2D surface in the Z dimension specified by
Z_index[Nx, Ny] and return a 2D slice[Nx, Ny].
It is not as simple as Values[:,:,Z_index].
I tried this:
>>> values.shape
(4, 5, 6)
>>> slice = values[:,:,coords]
>>> slice.shape
(4, 5, 4, 5)
>>> slice = np.take(values, coords, axis=2)
>>> slice.shape
(4, 5, 4, 5)
Obviously I could create an empty 2D slice and then fill it point by
point with np.ndenumerate, selecting values[i, j, Z_index[i, j]] each
time. This just seems too inefficient and not very pythonic.
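One vectorized way (a sketch, using the (4, 5, 6) shapes from the session above) is to pair Z_index with broadcast row and column index arrays, so that each output element (i, j) picks values[i, j, Z_index[i, j]]:

```python
import numpy as np

Nx, Ny, Nz = 4, 5, 6
values = np.arange(Nx * Ny * Nz, dtype=float).reshape(Nx, Ny, Nz)
Z_index = np.arange(Nx * Ny).reshape(Nx, Ny) % Nz   # any (Nx, Ny) int array

# ii has shape (Nx, 1), jj has shape (Ny,); together with Z_index (Nx, Ny)
# the three index arrays broadcast to an (Nx, Ny) result.
ii = np.arange(Nx)[:, None]
jj = np.arange(Ny)
surface = values[ii, jj, Z_index]

assert surface.shape == (Nx, Ny)
assert all(surface[i, j] == values[i, j, Z_index[i, j]]
           for i in range(Nx) for j in range(Ny))
```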
I am working on performance parity between numpy scalars/small arrays and
Python scalars as a GSoC project mentored by Charles.
Currently I am looking at PyArray_Return, which allocates separate memory
just for the scalar return. Unlike Python, which allocates memory once when
returning the result of a scalar operation, numpy calls malloc twice: once for
the array object itself, and a second time for the array data.
These memory allocations happen in PyArray_NewFromDescr and
PyArray_Scalar. Stashing both within a single allocation would be more
efficient.
In PyArray_Scalar, a new struct (PyLongScalarObject) needs to be allocated in
the case of scalar arrays. Instead, can we somehow just convert/cast the
PyArrayObject?
PyCon 2014 will be just around the corner from where I am, so I decided
to attend. Being lazy (or busy) I haven't submitted any big talk, but I am
thinking of submitting a few lightning talks (just 5 min and a 400-character
abstract limit), and I think it might be worth letting people know about my
little project. I would really appreciate your sincere feedback (e.g. "not
worth it" would be valuable too). Here is the title/abstract:
numpy-vbench -- speed benchmarks for NumPy
http://yarikoptic.github.io/numpy-vbench provides a collection of speed
performance benchmarks for NumPy. Benchmarking multiple
maintenance and current development branches makes it possible not only to
react to new performance regressions in a timely fashion, but also to compare
NumPy performance across releases. Your contributions would help to
guarantee that your code does not become slower with a new NumPy release.
btw -- fresh results are here: http://yarikoptic.github.io/numpy-vbench/ .
I have tuned the benchmarking so that it now reflects the best performance
across multiple executions of the whole battery, thus eliminating the spurious
variance you get when the estimate comes from a single point in time.
Eventually I expect many of those curves to become even "cleaner".
Yaroslav O. Halchenko, Ph.D.
Senior Research Associate, Psychological and Brain Sciences Dept.
Dartmouth College, 419 Moore Hall, Hinman Box 6207, Hanover, NH 03755
Phone: +1 (603) 646-9834 Fax: +1 (603) 646-1419
I'm excited to announce the v0.2 release of Bokeh, an interactive web
plotting library for Python. The long-term vision for Bokeh is to bring
rich, interactive graphics, via the full power of the browser's HTML5
Canvas, to Python users who don't need to write any JS or learn web
technologies.
The full blog post announcement is here:
The project website (with interactive gallery) is at:
And the Git repo is:
I am pleased to announce the availability of NumPy 1.8.0. This release is
the culmination of over a years worth of work by the NumPy team and
contains many fixes, enhancements, and new features. Highlights are:
- New, no 2to3, Python 2 and Python 3 are supported by a common code base.
- New, gufuncs for linear algebra, enabling operations on stacked arrays.
- New, inplace fancy indexing for ufuncs with the ``.at`` method.
- New, ``partition`` function, partial sorting via selection for fast median.
- New, ``nanmean``, ``nanvar``, and ``nanstd`` functions skipping NaNs.
- New, ``full`` and ``full_like`` functions to create value initialized arrays.
- New, ``PyUFunc_RegisterLoopForDescr``, better ufunc support for user types.
- Numerous performance improvements in many areas.
This release requires Python 2.6, 2.7 or 3.2-3.3; support for Python 2.4
and 2.5 has been dropped. Sources and binaries can be found at
Some 119 people contributed to this release. This is a remarkable increase
and shows that there is still life in this venerable code that had its
beginning in Numeric some 18 years ago. Many thanks to you all.
NumPy 1.8.0 Release Notes
This release supports Python 2.6 -2.7 and 3.2 - 3.3.
* New, no 2to3, Python 2 and Python 3 are supported by a common code base.
* New, gufuncs for linear algebra, enabling operations on stacked arrays.
* New, inplace fancy indexing for ufuncs with the ``.at`` method.
* New, ``partition`` function, partial sorting via selection for fast median.
* New, ``nanmean``, ``nanvar``, and ``nanstd`` functions skipping NaNs.
* New, ``full`` and ``full_like`` functions to create value initialized arrays.
* New, ``PyUFunc_RegisterLoopForDescr``, better ufunc support for user types.
* Numerous performance improvements in many areas.
Support for Python versions 2.4 and 2.5 has been dropped, and support
for SCons has been removed.
The Datetime64 type remains experimental in this release. In 1.9 there will
probably be some changes to make it more useable.
The diagonal method currently returns a new array and raises a
FutureWarning. In 1.9 it will return a readonly view.
Multiple field selection from an array of structured type currently
returns a new array and raises a FutureWarning. In 1.9 it will return a
view.
The numpy/oldnumeric and numpy/numarray compatibility modules will be
removed in 1.9.
The doc/sphinxext content has been moved into its own github repository,
and is included in numpy as a submodule. See the instructions in
doc/HOWTO_BUILD_DOCS.rst.txt for how to access the content.
.. _numpydoc: https://github.com/numpy/numpydoc
The hash function of numpy.void scalars has been changed. Previously the
pointer to the data was hashed as an integer. Now, the hash function uses
the tuple-hash algorithm to combine the hash functions of the elements of
the scalar, but only if the scalar is read-only.
Numpy has switched its build system to using 'separate compilation' by
default. In previous releases this was supported, but not default. This
should produce the same results as the old system, but if you're trying to
do something complicated like link numpy statically or using an unusual
compiler, then it's possible you will encounter problems. If so, please
file a bug and as a temporary workaround you can re-enable the old build
system by exporting the shell variable NPY_SEPARATE_COMPILATION=0.
For the AdvancedNew iterator the ``oa_ndim`` flag should now be -1 to indicate
that no ``op_axes`` and ``itershape`` are passed in. The ``oa_ndim == 0``
case now indicates a 0-D iteration with ``op_axes`` being NULL, and the old
usage is deprecated. This does not affect the ``NpyIter_New`` or
``NpyIter_MultiNew`` functions.
The functions nanargmin and nanargmax now return np.iinfo['intp'].min for
the index in all-NaN slices. Previously the functions would raise a ValueError
for array returns and NaN for scalar returns.
There is a new compile time environment variable
``NPY_RELAXED_STRIDES_CHECKING``. If this variable is set to 1, then
numpy will consider more arrays to be C- or F-contiguous -- for
example, it becomes possible to have a column vector which is
considered both C- and F-contiguous simultaneously. The new definition
is more accurate, allows for faster code that makes fewer unnecessary
copies, and simplifies numpy's code internally. However, it may also
break third-party libraries that make too-strong assumptions about the
stride values of C- and F-contiguous arrays. (It is also currently
known that this breaks Cython code using memoryviews, which will be
fixed in Cython.) THIS WILL BECOME THE DEFAULT IN A FUTURE RELEASE, SO
PLEASE TEST YOUR CODE NOW AGAINST NUMPY BUILT WITH::
NPY_RELAXED_STRIDES_CHECKING=1 python setup.py install
You can check whether NPY_RELAXED_STRIDES_CHECKING is in effect by
running::

    np.ones((10, 1), order="C").flags.f_contiguous
This will be ``True`` if relaxed strides checking is enabled, and
``False`` otherwise. The typical problem we've seen so far is C code
that works with C-contiguous arrays, and assumes that the itemsize can
be accessed by looking at the last element in the ``PyArray_STRIDES(arr)``
array. When relaxed strides are in effect, this is not true (and in
fact, it never was true in some corner cases). Instead, use
``PyArray_ITEMSIZE(arr)``.
For more information check the "Internal memory layout of an ndarray"
section in the documentation.
Binary operations with non-arrays as second argument
Binary operations of the form ``<array-or-subclass> * <non-array-subclass>``
where ``<non-array-subclass>`` declares an ``__array_priority__`` higher than
that of ``<array-or-subclass>`` will now unconditionally return
*NotImplemented*, giving ``<non-array-subclass>`` a chance to handle the
operation. Previously, `NotImplemented` would only be returned if
``<non-array-subclass>`` actually implemented the reversed operation, and after
a (potentially expensive) array conversion of ``<non-array-subclass>`` had been
attempted. (`bug <https://github.com/numpy/numpy/issues/3375>`_)
Function `median` used with `overwrite_input` only partially sorts array
If `median` is used with the `overwrite_input` option, the input array will
now only be partially sorted instead of fully sorted.
Fix to financial.npv
The npv function had a bug. Contrary to what the documentation stated, it
summed from indexes ``1`` to ``M`` instead of from ``0`` to ``M - 1``. The
fix changes the returned value. The mirr function called the npv function,
but worked around the problem, so that was also fixed and the return value
of the mirr function remains unchanged.
Runtime warnings when comparing NaN numbers
Comparing ``NaN`` floating point numbers now raises the ``invalid`` runtime
warning. If a ``NaN`` is expected the warning can be ignored using
``np.errstate``.
Support for linear algebra on stacked arrays
The gufunc machinery is now used for np.linalg, allowing operations on
stacked arrays and vectors. For example::

    >>> a
    array([[[ 1.,  1.],
            [ 0.,  1.]],

           [[ 1.,  1.],
            [ 0.,  1.]]])

    >>> np.linalg.inv(a)
    array([[[ 1., -1.],
            [ 0.,  1.]],

           [[ 1., -1.],
            [ 0.,  1.]]])
In place fancy indexing for ufuncs
The function ``at`` has been added to ufunc objects to allow in place
ufuncs with no buffering when fancy indexing is used. For example, the
following will increment the first and second items in the array, and will
increment the third item twice: ``numpy.add.at(arr, [0, 1, 2, 2], 1)``.
This is what many have mistakenly thought ``arr[[0, 1, 2, 2]] += 1`` would do,
but that does not work as the incremented value of ``arr[2]`` is simply copied
into the third slot in ``arr`` twice, not incremented twice.
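A quick sketch of the difference:

```python
import numpy as np

arr = np.zeros(4)
arr[[0, 1, 2, 2]] += 1           # buffered: arr[2] ends up incremented once
assert arr.tolist() == [1.0, 1.0, 1.0, 0.0]

arr = np.zeros(4)
np.add.at(arr, [0, 1, 2, 2], 1)  # unbuffered: arr[2] is incremented twice
assert arr.tolist() == [1.0, 1.0, 2.0, 0.0]
```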
New functions `partition` and `argpartition`
New functions to partially sort arrays via a selection algorithm.
A ``partition`` by index ``k`` moves the ``k`` smallest element to the front of
an array. All elements before ``k`` are then smaller than or equal to the value
in position ``k``, and all elements following ``k`` are greater than or equal
to the value in position ``k``. The ordering of the values within these
bounds is undefined.
A sequence of indices can be provided to sort all of them into their sorted
position at once via iterative partitioning.
This can be used to efficiently obtain order statistics like the median or
percentiles of samples.
``partition`` has a linear time complexity of ``O(n)`` while a full sort has
``O(n log(n))``.
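For example, the median of a small sample via ``partition`` (a sketch):

```python
import numpy as np

a = np.array([7, 2, 9, 4, 1, 8, 3])
k = len(a) // 2                      # index of the median for odd length

p = np.partition(a, k)
# p[k] is the (k+1)-th smallest value; elements before it are <= p[k],
# elements after it are >= p[k], but within each side order is undefined.
assert p[k] == np.median(a)
assert (p[:k] <= p[k]).all() and (p[k:] >= p[k]).all()

# argpartition gives the selecting indices instead of the values.
idx = np.argpartition(a, k)
assert a[idx[k]] == np.median(a)
```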
New functions `nanmean`, `nanvar` and `nanstd`
New nan-aware statistical functions have been added. In these functions the
results are what would be obtained if NaN values were omitted from all
computations.
New functions `full` and `full_like`
New convenience functions to create arrays filled with a specific value;
complementary to the existing `zeros` and `zeros_like` functions.
IO compatibility with large files
Large NPZ files >2GB can be loaded on 64-bit systems.
Building against OpenBLAS
It is now possible to build numpy against OpenBLAS by editing site.cfg.
Euler's constant is now exposed in numpy as euler_gamma.
New modes for qr
New modes 'complete', 'reduced', and 'raw' have been added to the qr
factorization and the old 'full' and 'economic' modes are deprecated.
The 'reduced' mode replaces the old 'full' mode and is the default as was
the 'full' mode, so backward compatibility can be maintained by not
specifying the mode.
The 'complete' mode returns a full dimensional factorization, which can be
useful for obtaining a basis for the orthogonal complement of the range
space. The 'raw' mode returns arrays that contain the Householder
reflectors and scaling factors that can be used in the future to apply q
without needing to convert to a matrix. The 'economic' mode is simply
deprecated, there isn't much use for it and it isn't any more efficient
than the 'raw' mode.
New `invert` argument to `in1d`
The function `in1d` now accepts an `invert` argument which, when `True`,
causes the returned array to be inverted.
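A short sketch:

```python
import numpy as np

a = np.array([1, 2, 3, 4])
b = np.array([2, 4])

mask = np.in1d(a, b)                 # membership test for each element of a
assert mask.tolist() == [False, True, False, True]

inv = np.in1d(a, b, invert=True)     # equivalent to ~np.in1d(a, b)
assert inv.tolist() == [True, False, True, False]
```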
Advanced indexing using `np.newaxis`
It is now possible to use `np.newaxis`/`None` together with index
arrays instead of only in simple indices. This means that
``array[np.newaxis, [0, 1]]`` will now work as expected and select the first
two rows while prepending a new axis to the array.
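A short sketch of the new behaviour:

```python
import numpy as np

a = np.arange(12).reshape(3, 4)

# Select the first two rows while prepending a new axis of length 1.
sel = a[np.newaxis, [0, 1]]
assert sel.shape == (1, 2, 4)
assert (sel[0] == a[[0, 1]]).all()
```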
New ufuncs can now be registered with builtin input types and a custom
output type. Before this change, NumPy wouldn't be able to find the right
ufunc loop function when the ufunc was called from Python, because the ufunc
loop signature matching logic wasn't looking at the output operand type.
Now the correct ufunc loop is found, as long as the user provides an output
argument with the correct output type.
A simple test runner script ``runtests.py`` was added. It also builds Numpy
via ``setup.py build`` and can be used to run tests easily during development.
IO performance improvements
Performance in reading large files was improved by chunking (see also IO
compatibility).
Performance improvements to `pad`
The `pad` function has a new implementation, greatly improving performance for
all inputs except `mode=<function>` (retained for backwards compatibility).
Scaling with dimensionality is dramatically improved for rank >= 4.
Performance improvements to `isnan`, `isinf`, `isfinite` and `byteswap`
`isnan`, `isinf`, `isfinite` and `byteswap` have been improved to take
advantage of compiler builtins to avoid expensive calls to libc.
This improves performance of these operations by about a factor of two on
gnu libc systems.
Performance improvements via SSE2 vectorization
Several functions have been optimized to make use of SSE2 CPU SIMD
instructions.

* Float32 and float64:
  * base math (`add`, `subtract`, `divide`, `multiply`)
  * `sqrt`
  * `minimum/maximum`
  * `absolute`

* Bool:
  * `logical_or`
  * `logical_and`
  * `logical_not`

This improves performance of these operations up to 4x/2x for float32/float64
and up to 10x for bool depending on the location of the data in the CPU caches.
The performance gain is greatest for in-place operations.

In order to use the improved functions the SSE2 instruction set must be enabled
at compile time. It is enabled by default on x86_64 systems. On x86_32 with a
capable CPU it must be enabled by passing the appropriate flag to the CFLAGS
build variable (-msse2 with gcc).
Performance improvements to `median`
`median` is now implemented in terms of `partition` instead of `sort`, which
reduces its time complexity from O(n log(n)) to O(n).
If used with the `overwrite_input` option the array will now only be partially
sorted instead of fully sorted.
Overrideable operand flags in ufunc C-API
When creating a ufunc, the default ufunc operand flags can be overridden
via the new op_flags attribute of the ufunc object. For example, to set
the operand flag for the first input to read/write::

    PyObject *ufunc = PyUFunc_FromFuncAndData(...);
    ufunc->op_flags[0] = NPY_ITER_READWRITE;

This allows a ufunc to perform an operation in place. Also, global nditer flags
can be overridden via the new iter_flags attribute of the ufunc object.
For example, to set the reduce flag for a ufunc::

    ufunc->iter_flags = NPY_ITER_REDUCE_OK;
The function np.take now allows 0-d arrays as indices.
The separate compilation mode is now enabled by default.
Several changes to np.insert and np.delete:
* Previously, negative indices and indices that pointed past the end of
  the array were simply ignored. Now, this will raise a Future or Deprecation
  Warning. In the future they will be treated like normal indexing treats
  them -- negative indices will wrap around, and out-of-bound indices will
  generate an error.
* Previously, boolean indices were treated as if they were integers (always
referring to either the 0th or 1st item in the array). In the future, they
will be treated as masks. In this release, they raise a FutureWarning
warning of this coming change.
* In Numpy 1.7, np.insert already allowed the syntax
  `np.insert(arr, 3, [1, 2, 3])` to insert multiple items at a single position.
  In Numpy 1.8, this is also possible for `np.insert(arr, [3], [1, 2, 3])`.
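A short sketch of the two spellings (both insert the whole block at position 3):

```python
import numpy as np

arr = np.arange(5)                    # [0, 1, 2, 3, 4]

# Scalar position: all items go to position 3 (already worked in 1.7).
a = np.insert(arr, 3, [10, 11])
assert a.tolist() == [0, 1, 2, 10, 11, 3, 4]

# Length-one sequence position [3]: same result, supported from 1.8 on.
b = np.insert(arr, [3], [10, 11])
assert b.tolist() == [0, 1, 2, 10, 11, 3, 4]
```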
Padded regions from np.pad are now correctly rounded, not truncated.
C-API Array Additions
Four new functions have been added to the array C-API.
C-API Ufunc Additions
One new function has been added to the ufunc C-API that allows registering
an inner loop for user types using the descr.
The 'full' and 'economic' modes of qr factorization are deprecated.
The use of non-integer values for indices and most integer arguments has been
deprecated. Previously float indices and function arguments such as axes or
shapes were truncated to integers without warning. For example
`arr.reshape(3., -1)` or `arr[0.]` will trigger a deprecation warning in
NumPy 1.8, and in some future version of NumPy they will raise an error.
This release contains work by the following people who contributed at least
one patch to this release. The names are in alphabetical order by first
name.
* Adam Ginsburg +
* Adam Griffiths +
* Alexander Belopolsky +
* Alex Barth +
* Alex Ford +
* Andreas Hilboll +
* Andreas Kloeckner +
* Andreas Schwab +
* Andrew Horton +
* argriffing +
* Arink Verma +
* Bago Amirbekian +
* Bartosz Telenczuk +
* bebert218 +
* Benjamin Root +
* Bill Spotz +
* Bradley M. Froehle
* Carwyn Pelley +
* Charles Harris
* Christian Brueffer +
* Christoph Dann +
* Christoph Gohlke
* Dan Hipschman +
* Daniel +
* Dan Miller +
* daveydave400 +
* David Cournapeau
* David Warde-Farley
* Denis Laxalde
* dmuellner +
* Edward Catmur +
* Egor Zindy +
* Eric Firing
* Eric Fode
* Eric Moore +
* Eric Price +
* Fazlul Shahriar +
* Félix Hartmann +
* Fernando Perez
* Frank B +
* Frank Breitling +
* Guillaume Gay +
* Han Genuit
* HaroldMills +
* hklemm +
* jamestwebber +
* Jason Madden +
* Jay Bourque
* jeromekelleher +
* Jesús Gómez +
* jmozmoz +
* jnothman +
* Johannes Schönberger +
* John Benediktsson +
* John Salvatier +
* John Stechschulte +
* Jonathan Waltman +
* Joon Ro +
* Jos de Kloe +
* Joseph Martinot-Lagarde +
* Josh Warner (Mac) +
* Jostein Bø Fløystad +
* Juan Luis Cano Rodríguez +
* Julian Taylor +
* Julien Phalip +
* K.-Michael Aye +
* Kumar Appaiah +
* Lars Buitinck
* Leon Weber +
* Luis Pedro Coelho
* Marcin Juszkiewicz
* Mark Wiebe
* Marten van Kerkwijk +
* Martin Baeuml +
* Martin Spacek
* Martin Teichmann +
* Matt Davis +
* Matthew Brett
* Maximilian Albert +
* m-d-w +
* Michael Droettboom
* mwtoews +
* Nathaniel J. Smith
* Nicolas Scheffer +
* Nils Werner +
* ochoadavid +
* Ondřej Čertík
* ovillellas +
* Paul Ivanov
* Pauli Virtanen
* Ralf Gommers
* Raul Cota +
* Richard Hattersley +
* Robert Costa +
* Robert Kern
* Rob Ruana +
* Ronan Lamy
* Sandro Tosi
* Sascha Peilicke +
* Sebastian Berg
* Skipper Seabold
* Stefan van der Walt
* Steve +
* Takafumi Arakaki +
* Thomas Robitaille +
* Tomas Tomecek +
* Travis E. Oliphant
* Valentin Haenel
* Vladimir Rutsky +
* Warren Weckesser
* Yaroslav Halchenko
* Yury V. Zaytsev +
A total of 119 people contributed to this release.
People with a "+" by their names contributed a patch for the first time.
Is there a standard way in numpy of getting a char with C-native
integer signedness? I.e.,
boost::is_signed<char>::value ? numpy.byte : numpy.ubyte
but without nonsensical mixing of languages?
import numpy as np
#from accumulator import stat2nd_double
## Just to make this really clear, I'm making a dummy
## class here that overloads +=
class stat2nd_double (object):
    def __iadd__ (self, x):
        return self

m = np.empty ((2,3), dtype=object)
m[:,:] = stat2nd_double()
m[0,0] += 1.0          # <<<< no error here
m += np.ones ((2,3))   # <<< but this gives an error
Traceback (most recent call last):
File "test_stat.py", line 13, in <module>
m += np.ones ((2,3))
TypeError: unsupported operand type(s) for +: 'stat2nd_double' and 'float'
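If it helps: `m += np.ones((2, 3))` dispatches to np.add on the object array, and for object dtype np.add applies the plain binary `+` to each element pair -- it never calls the elements' `__iadd__` -- hence the TypeError. Defining `__add__`/`__radd__` on the dummy class makes it work; a sketch:

```python
import numpy as np

class stat2nd_double(object):
    def __iadd__(self, x):
        return self
    def __add__(self, x):        # called by np.add for object arrays
        return self
    __radd__ = __add__           # in case the float ends up on the left

m = np.empty((2, 3), dtype=object)
m[:, :] = stat2nd_double()
m += np.ones((2, 3))             # no TypeError now
assert isinstance(m[0, 0], stat2nd_double)
```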
I am using numpy with ipython from anaconda and I observe the following
Python 2.7.5 |Anaconda 1.7.0 (64-bit)| (default, Jun 28 2013, 22:10:09)
Type "copyright", "credits" or "license" for more information.
IPython 1.0.0 -- An enhanced Interactive Python.
? -> Introduction and overview of IPython's features.
%quickref -> Quick reference.
help -> Python's own help system.
object? -> Details about 'object', use 'object??' for extra details.
Using matplotlib backend: Qt4Agg
In [1]: a = np.random.rand(500000, 1000)
In [2]: a = a[:10000]
In [3]: c = np.random.rand(500000, 1000)
After In [1] I have an extra 3.7 GB of memory used, but this memory is
not released at In [2]. I thought there might be some clever memory
management trick, so I executed In [3], but that just added an extra 3.7 GB
of memory without releasing anything.
Is that the right behavior in this case?
If yes, then how do you release memory by slicing away parts of an
array? Can you give me a description of the numpy internals in this case?
Thank you very much for your time,
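This is expected behaviour: `a[:10000]` is a view that keeps a reference to the original 4 GB buffer, so that buffer cannot be freed while the view is alive. Making an explicit copy drops the reference. A small sketch (with a small stand-in array):

```python
import numpy as np

a = np.random.rand(1000, 100)     # stand-in for the big array
v = a[:10]                        # view: shares a's buffer and keeps it alive
assert v.base is a

# To let the big buffer be released, copy the slice; the copy owns its data.
b = a[:10].copy()
assert b.base is None

del a, v                          # now the original buffer can be freed
assert b.shape == (10, 100)       # b stays valid after the originals are gone
```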