[Numpy-discussion] New function: np.stack?

Stephan Hoyer shoyer at gmail.com
Thu Feb 5 14:06:17 EST 2015


There are two usual ways to combine a sequence of arrays into a new array:
1. concatenated along an existing axis
2. stacked along a new axis

For 1, we have np.concatenate. For 2, we have np.vstack, np.hstack,
np.dstack and np.column_stack. For arrays with arbitrary dimensions, there
is the np.array constructor, possibly with transpose to get the result in
the correct order. (I've used this last option in the past but haven't been
especially happy with it -- it takes some trial and error to get the axis
swapping or transpose right for higher dimensional input.)
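
To illustrate how the existing options differ, here is a minimal sketch. The inputs (a, b and the list X) are made-up examples, not taken from any real code:

```python
import numpy as np

# Hypothetical 1-d inputs: each helper puts the arrays together differently.
a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

shape_v = np.vstack([a, b]).shape        # (2, 3): new leading axis
shape_h = np.hstack([a, b]).shape        # (6,): no new axis, plain concatenation
shape_d = np.dstack([a, b]).shape        # (1, 3, 2): new trailing axis
shape_c = np.column_stack([a, b]).shape  # (3, 2): arrays become columns

# Hypothetical 2-d inputs, using the np.array constructor instead.
X = [np.zeros((3, 4)) for _ in range(5)]

first = np.array(X)                # new axis always comes first: (5, 3, 4)
middle = first.transpose(1, 0, 2)  # move it to position 1: (3, 5, 4)
last = first.transpose(1, 2, 0)    # move it to position 2: (3, 4, 5)
```

The permutation passed to transpose has to be worked out by hand for each target axis and each input dimensionality, which is the trial and error mentioned above.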

These methods are similar but subtly distinct, and none of them generalize
well to n-dimensional input. It seems like the function we are missing is
the plain np.stack, which takes the axis to stack along as a keyword
argument. The exact desired functionality is clearest to understand by
example:

>>> X = [np.random.randn(100, 200) for i in range(10)]
>>> stack(X, axis=0).shape
(10, 100, 200)
>>> stack(X, axis=1).shape
(100, 10, 200)
>>> stack(X, axis=2).shape
(100, 200, 10)
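
One possible way to get this behaviour from existing functions is sketched below. This is only an illustration under stated assumptions, not a proposed implementation: the name stack_sketch and its error messages are hypothetical, and it assumes all inputs share the same shape.

```python
import numpy as np

def stack_sketch(arrays, axis=0):
    """Join equally shaped arrays along a new axis (illustrative sketch only)."""
    arrays = [np.asarray(a) for a in arrays]
    if not arrays:
        raise ValueError("need at least one array to stack")
    if len({a.shape for a in arrays}) != 1:
        raise ValueError("all input arrays must have the same shape")
    # Give every array a length-1 axis at the requested position,
    # then concatenate along that axis.
    expanded = [np.expand_dims(a, axis) for a in arrays]
    return np.concatenate(expanded, axis=axis)

X = [np.random.randn(100, 200) for i in range(10)]
shapes = [stack_sketch(X, axis=k).shape for k in range(3)]
```

The point of the design is that the caller names the position of the new axis directly instead of working out a transpose.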

So I'd like to propose this new function for numpy. The desired signature
would be simply np.stack(arrays, axis=0). Ideally, the confusing mess of
other stacking functions could then be deprecated, though we could probably
never remove them.

Matthew Rocklin recently wrote an out-of-core version of this for his dask
project (part of Blaze), which is what got me thinking about the need for
this functionality:
https://github.com/ContinuumIO/dask/pull/30

Cheers,
Stephan