numpy.stack -- which function, if any, deserves the name?
In the past months there have been two proposals for new numpy functions using the name "stack":

1. np.stack for stacking like np.asarray(np.bmat(...))
http://thread.gmane.org/gmane.comp.python.numeric.general/58748/
https://github.com/numpy/numpy/pull/5057

2. np.stack for stacking along an arbitrary new axis (this was my proposal)
http://thread.gmane.org/gmane.comp.python.numeric.general/59850/
https://github.com/numpy/numpy/pull/5605

Both functions generalize the notion of stacking arrays from the existing hstack, vstack and dstack, but in two very different ways. Both could be useful -- but we can only call one "stack". Which one deserves that name?

The existing *stack functions use the word "stack" to refer to combining arrays in two similarly different ways:

a. For ND -> ND stacking along an existing dimension (like numpy.concatenate and proposal 1)
b. For ND -> (N+1)D stacking along a new dimension (like proposal 2).

I think it would be much cleaner API design if we had different words to denote these two different operations. Concatenate for "combine along an existing dimension" already exists, so my thought (when I wrote proposal 2) was that the verb "stack" could be reserved (going forward) for "combine along a new dimension." This also has the advantage of suggesting that "concatenate" and "stack" are the two fundamental operations for combining N-dimensional arrays. The documentation on this is currently quite confusing, mostly because no function like the one in proposal 2 currently exists.

Of course, the *stack functions have existed for quite some time, and in many cases vstack and hstack are indeed used for concatenate-like functionality (e.g., whenever they are used for 2D arrays/matrices). So the case is not entirely clear-cut. (We'll never be able to remove this functionality from NumPy.)

In any case, I would appreciate your thoughts.

Best,
Stephan
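To make the distinction between (a) and (b) concrete, here is a minimal sketch using only functions that already exist. The helper name stack_new_axis is hypothetical and is not the code from either PR; it only illustrates the "combine along a new dimension" behaviour described for proposal 2.

```python
import numpy as np


def stack_new_axis(arrays, axis=0):
    """Hypothetical helper: join equally shaped arrays along a NEW axis."""
    # Give every input a length-1 axis at the requested position,
    # then concatenate along that freshly created axis.
    expanded = [np.expand_dims(np.asarray(a), axis) for a in arrays]
    return np.concatenate(expanded, axis=axis)


a = np.zeros((2, 3))
b = np.ones((2, 3))

# (a) ND -> ND: the result has the same number of dimensions as the inputs.
joined = np.concatenate([a, b], axis=0)
print(joined.shape)   # (4, 3)

# (b) ND -> (N+1)D: the result gains one dimension.
stacked = stack_new_axis([a, b], axis=0)
print(stacked.shape)  # (2, 2, 3)

# The existing *stack functions mix both behaviours for 1D inputs:
x = np.array([1, 2, 3])
y = np.array([4, 5, 6])
h = np.hstack([x, y])  # concatenate-like, stays 1D
v = np.vstack([x, y])  # adds a dimension, becomes 2D
print(h.shape, v.shape)
```

For 1D inputs, hstack behaves like (a) while vstack behaves like (b), which is the ambiguity in the current naming.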
Hey,
1. np.stack for stacking like np.asarray(np.bmat(...)) http://thread.gmane.org/gmane.comp.python.numeric.general/58748/ https://github.com/numpy/numpy/pull/5057
I'm the author of this proposal. I'll just give some context real quickly.

"My stack" started really simple, basically allowing a Matlab-like notation for stacking:

matlab: [ a b; c d ]
numpy: stack([[a, b], [c, d]]) or even stack([a, b], [c, d])

where a, b, c, and d are arrays.

During the discussion people asked for fancier stacking and auto-filling of blocks that are not explicitly set (think of an "eye" matrix where only certain blocks are set).

Alternatively, we thought of refactoring the core of bmat [2] so that it can be used with arrays and matrices. This would allow stack("a b; c d") where a, b, c, and d are the names of arrays/matrices. (Also bmat would get better documentation during the refactoring :)).

Summarizing, my proposal is mostly concerned with how to create block arrays from given arrays. I don't care about the name "stack". I just used "stack" because it replaced hstack/vstack for me. Maybe "bstack" for block stack, or "barray" for block array?

I have the feeling [1] that my use case is more common, but I like the second proposal.

Cheers,
Stefan

[1] Everybody generalizes from oneself. At least I do.
[2] http://docs.scipy.org/doc/numpy/reference/generated/numpy.bmat.html
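As a rough illustration of the Matlab-like block notation, the 2D case can be sketched with the existing hstack and vstack. The helper name block2d is hypothetical and is not the implementation from the PR; it assumes every block is 2D and that the shapes line up within each row and across rows.

```python
import numpy as np


def block2d(rows):
    """Hypothetical helper: build a 2D block array from a nested list."""
    # Join the blocks of each row side by side, then pile the rows up.
    return np.vstack([np.hstack(row) for row in rows])


a = np.ones((2, 2))
b = np.zeros((2, 3))
c = np.zeros((1, 2))
d = np.ones((1, 3))

# Matlab: [ a b; c d ]
result = block2d([[a, b], [c, d]])
print(result.shape)  # (3, 5)
print(result)
```

The result is a plain ndarray, not a matrix, which is the difference from calling bmat directly.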
We already use the word "stack" in lots of function names to refer to something different from what bmat does. So while I definitely agree we should have something like bmat for ndarrays, it would be better all the same to just pick a different name. np.block, even, might do the job.
On Mon, Mar 16, 2015 at 1:50 AM, Stefan Otte <stefan.otte@gmail.com> wrote:
Summarizing, my proposal is mostly concerned with how to create block arrays from given arrays. I don't care about the name "stack". I just used "stack" because it replaced hstack/vstack for me. Maybe "bstack" for block stack, or "barray" for block array?
Stefan -- thanks for sharing your perspective! In conclusion, it sounds like we could safely use "stack" for my PR (proposal 2), and use another name (perhaps "block", "barray" or "block_array") for your proposal. I'm also not opposed to using a new verb for my PR (the stacking alternative to "concatenate"), but I haven't come up with any more descriptive alternatives.
participants (3)
- Nathaniel Smith
- Stefan Otte
- Stephan Hoyer