There are two usual ways to combine a sequence of arrays into a new array:

1. concatenated along an existing axis
2. stacked along a new axis

For 1, we have np.concatenate. For 2, we have np.vstack, np.hstack, np.dstack and np.column_stack. For arrays with arbitrary dimensions, there is the np.array constructor, possibly with transpose to get the result in the correct order. (I've used this last option in the past but haven't been especially happy with it -- it takes some trial and error to get the axis swapping or transpose right for higher dimensional input.)

These methods are similar but subtly distinct, and none of them generalizes well to n-dimensional input. It seems like the function we are missing is the plain np.stack, which takes the axis to stack along as a keyword argument. The exact desired functionality is easiest to understand by example:
>>> X = [np.random.randn(100, 200) for i in range(10)]
>>> stack(X, axis=0).shape
(10, 100, 200)
>>> stack(X, axis=1).shape
(100, 10, 200)
>>> stack(X, axis=2).shape
(100, 200, 10)
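For comparison, here is a rough sketch of the np.array-plus-transpose workaround mentioned earlier, reproducing the three shapes from the example. The helper name stack_via_array is hypothetical and exists only for illustration; it assumes all inputs share one shape and that axis is non-negative.

```python
import numpy as np

def stack_via_array(arrays, axis=0):
    """Hypothetical helper: emulate stacking along a new axis with
    np.array plus a transpose (assumes equal shapes and 0 <= axis <= ndim)."""
    out = np.array(arrays)  # the new axis is always created at position 0
    # Build the permutation that moves axis 0 to position `axis`
    # while keeping the remaining axes in their original order.
    order = list(range(1, out.ndim))
    order.insert(axis, 0)
    return out.transpose(order)

X = [np.random.randn(100, 200) for i in range(10)]
shapes = [stack_via_array(X, axis=k).shape for k in range(3)]
print(shapes)
```

Working out the permutation by hand for each target axis is the trial-and-error step that a single axis keyword would remove.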
So I'd like to propose this new function for numpy. The desired signature would be simply np.stack(arrays, axis=0). Ideally, the confusing mess of other stacking functions could then be deprecated, though we could probably never remove them.

Matthew Rocklin recently wrote an out-of-core version of this for his dask project (part of Blaze), which is what got me thinking about the need for this functionality: https://github.com/ContinuumIO/dask/pull/30

Cheers,
Stephan
+1! I could never keep straight which stack function I needed anyway.

Wasn't there a proposal a while back for a more generic stacker, like "tetrix" or something that allowed one to piece together tiles of different sizes?

Ben Root

On Thu, Feb 5, 2015 at 2:06 PM, Stephan Hoyer <shoyer@gmail.com> wrote:
> [...]
On Thu, Feb 5, 2015 at 11:10 AM, Benjamin Root <ben.root@ou.edu> wrote:
> [...]
Leaving aside error checking, once you have a positive axis, I think this can be implemented in 2 lines of code:

    # index that inserts a new length-1 axis at position `axis`
    sl = (slice(None),) * axis + (np.newaxis,)
    # np.concatenate needs a sequence (not a generator) and the target axis
    return np.concatenate([arr[sl] for arr in arrays], axis=axis)

I don't have an opinion either way, and I guess if the hstacks and company have a place in numpy, this does as well.

Jaime

--
(\__/)
( O.o)
( > <) This is Conejo (Bunny). Copy Conejo into your signature and help him with his plans for world domination.
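As a sketch only, the two-line idea can be fleshed out with the error checking and negative-axis handling it sets aside. The name stack_sketch and the specific error messages are assumptions made for illustration, not an agreed design; the sketch passes a list and an explicit axis to np.concatenate.

```python
import numpy as np

def stack_sketch(arrays, axis=0):
    """Hypothetical sketch of the proposed stack(arrays, axis=0)."""
    arrays = [np.asarray(arr) for arr in arrays]
    if not arrays:
        raise ValueError("need at least one array to stack")
    if len({arr.shape for arr in arrays}) != 1:
        raise ValueError("all input arrays must have the same shape")

    # The result has one more dimension than each input.
    result_ndim = arrays[0].ndim + 1
    if not -result_ndim <= axis < result_ndim:
        raise IndexError("axis {} out of bounds".format(axis))
    if axis < 0:
        axis += result_ndim  # normalize to a non-negative axis

    # Insert a length-1 axis at position `axis` in every input,
    # then join the inputs along that new axis.
    sl = (slice(None),) * axis + (np.newaxis,)
    return np.concatenate([arr[sl] for arr in arrays], axis=axis)

X = [np.random.randn(100, 200) for i in range(10)]
print(stack_sketch(X, axis=1).shape)
print(stack_sketch(X, axis=-1).shape)
```

Normalizing a negative axis against the result's dimensionality (rather than the inputs') is one possible choice; it lets axis=-1 mean "append the new axis last".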
participants (3)
- Benjamin Root
- Jaime Fernández del Río
- Stephan Hoyer