Added atleast_nd, request for clarification/cleanup of atleast_3d
Hi,
I have generalized np.atleast_1d, np.atleast_2d, and np.atleast_3d with a function np.atleast_nd in PR #7804 (https://github.com/numpy/numpy/pull/7804).
As a result of this PR, I have a couple of questions about `np.atleast_3d`. `np.atleast_3d` appears to do something weird with the dimensions: if the input is 1D, it prepends and appends a size-1 dimension. If the input is 2D, it appends a size-1 dimension. This is inconsistent with `np.atleast_2d`, which always prepends (as does `np.atleast_nd`).
Is there any reason for this behavior?
Can it be cleaned up (e.g., by reimplementing `np.atleast_3d` in terms of `np.atleast_nd`, which is actually much simpler)? This would be a slight API change since the output would not be exactly the same.
Thanks,
Joe
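For readers who want to see the inconsistency described above, here is a minimal sketch that only uses the existing NumPy functions (the variable names are illustrative, not from the PR):

```python
import numpy as np

vec = np.arange(3)        # 1D input, shape (3,)
mat = np.zeros((2, 3))    # 2D input, shape (2, 3)

# atleast_2d prepends a size-1 dimension to a 1D input.
shape_2d_from_1d = np.atleast_2d(vec).shape   # (1, 3)

# atleast_3d both prepends and appends for a 1D input ...
shape_3d_from_1d = np.atleast_3d(vec).shape   # (1, 3, 1)

# ... and only appends for a 2D input.
shape_3d_from_2d = np.atleast_3d(mat).shape   # (2, 3, 1)

print(shape_2d_from_1d, shape_3d_from_1d, shape_3d_from_2d)
```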
Changing atleast_3d seems likely to break a bunch of stuff...
Beyond that, I find it hard to have an opinion about the best design for these functions, because I don't think I've ever encountered a situation where they were actually what I wanted. I'm not a big fan of coercing dimensions in the first place, for the usual "refuse to guess" reasons. And then generally if I do want to coerce an array to another dimension, then I have some opinion about where the new dimensions should go, and/or I have some opinion about the minimum acceptable starting dimension, and/or I have a maximum dimension in mind. (E.g. "coerce 1d inputs into a column matrix; 0d or 3d inputs are an error": atleast_2d is zero-for-three on that requirements list.)
I don't know how typical I am in this. But it does make me wonder if the atleast_* functions act as an attractive nuisance, where new users take their presence as an implicit recommendation that they are actually a useful thing to reach for, even though they... aren't that. And maybe we should be recommending folk move away from them rather than trying to extend them further?
Or maybe they're totally useful and I'm just missing it. What's your use case that motivates atleast_nd?
-n
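The stricter kind of coercion described in this message ("coerce 1d inputs into a column matrix; 0d or 3d inputs are an error") could be sketched roughly as follows. The helper name `as_column_matrix` is hypothetical and exists only for this illustration; it is not a NumPy function.

```python
import numpy as np

def as_column_matrix(x):
    """Hypothetical helper: accept only 1D or 2D input, return a 2D array.

    1D input of length N becomes a column of shape (N, 1);
    2D input is returned unchanged; anything else is an error.
    """
    a = np.asarray(x)
    if a.ndim == 1:
        return a[:, np.newaxis]   # new axis goes last, giving a column
    if a.ndim == 2:
        return a
    raise ValueError(f"expected a 1D or 2D input, got {a.ndim}D")

col = as_column_matrix([1, 2, 3])
print(col.shape)
```

Unlike `np.atleast_2d`, this picks the position of the new axis explicitly and rejects 0D and 3D inputs instead of passing them through.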
I think you're just missing it :) atleast_1d/2d are used quite a bit in SciPy and statsmodels (those are the only ones I checked), and in the large majority of cases they are the best thing to use there. There's a bunch of atleast_2d calls with a transpose appended because the input needs to be treated as columns instead of rows, but that's still efficient and readable enough.
For 3D/nD I can see that you'd need more control over where the dimensions go, but 1D/2D are fine.
Ralf
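The "atleast_2d with a transpose appended" pattern mentioned above looks roughly like this (a small sketch with made-up data):

```python
import numpy as np

y = np.array([1.0, 2.0, 3.0])

as_row = np.atleast_2d(y)        # 1D input becomes a single row: (1, 3)
as_col = np.atleast_2d(y).T      # transposing turns it into a column: (3, 1)

# Note that the transpose also applies when the input is already 2D.
already_2d = np.ones((5, 2))
flipped = np.atleast_2d(already_2d).T   # (2, 5)

print(as_row.shape, as_col.shape, flipped.shape)
```

The last line is worth keeping in mind: the pattern treats every input as rows first, so a 2D input that is already arranged in columns gets flipped as well.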
As Ralf pointed out, they are used in statsmodels. I do find them useful as a replacement for several lines of ifs and reshapes.
In many cases we still need atleast_2d_cols, which appends the newaxis if necessary. It is roughly equivalent to

    if x.ndim == 1:
        x = x[:, None]
    else:
        x = np.atleast_2d(x)
Josef
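Wrapped into a function, the snippet above might look like the sketch below. The name `atleast_2d_cols` is taken from the message; the body is only an illustration of the described behavior and is not the statsmodels code. The `np.asarray` call is an added assumption so that lists are accepted too.

```python
import numpy as np

def atleast_2d_cols(x):
    """Sketch: like np.atleast_2d, but a 1D input becomes a column."""
    x = np.asarray(x)          # assumption: also accept lists and scalars
    if x.ndim == 1:
        return x[:, None]      # append the new axis: (N,) -> (N, 1)
    return np.atleast_2d(x)    # 0D -> (1, 1); 2D and higher unchanged

print(atleast_2d_cols([1, 2, 3]).shape)
print(atleast_2d_cols(np.ones((5, 2))).shape)
```

In contrast to `np.atleast_2d(x).T`, a 2D input keeps its orientation here.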
NumPy-Discussion mailing list NumPy-Discussion@scipy.org https://mail.scipy.org/mailman/listinfo/numpy-discussion
statsmodels currently has very little code with ndim > 2, so I have no overview of possible use cases, but it would be necessary to have full control over the added axes, since axes have a strict meaning and stats still prefers Fortran order to the default numpy/C ordering.
Josef
I can add a keyword-only argument that lets you put the new dims before or after the existing ones. I am not sure how to specify arbitrary patterns for the new dimensions, but that should take care of most use cases.
The use case that motivated this function in the first place is that I am doing some processing on 4D arrays and I need to reduce them but return a result with the original dimensionality (but not shape). atleast_nd seemed like a better solution than atleast_4d.
Joe
While atleast_1d/2d/3d predate my involvement in numpy, I am probably partly to blame for popularizing them as I helped to fix them up a fair amount. I wouldn't call their use "guessing". Rather, I would treat them as useful input sanitizers. If your function is going to be doing 2d indexing on an input, then it is very convenient to have atleast_2d() at the top of your function, not only to sanitize the input, but to make it clear that your code expects at least two dimensions.
One place where it is used is in np.loadtxt(..., ndmin=N) to protect against the situation of a single row of data becoming a 1D array rather than a 2D array (or an empty text file returning something completely useless).
I have previously pointed out the oddity with atleast_3d(). I can't remember the explanation I got though. Maybe someone can find the old thread that has the explanation, if any?
I think the keyword argument approach for controlling the behavior might be a good approach, provided that a suitable design could be devised. 1 & 2 dimensions is fairly trivial to control, but 3+ dimensions has too many degrees of freedom for me to consider.
Cheers! Ben Root
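Both points can be illustrated with a short sketch. The `np.loadtxt(..., ndmin=...)` argument is the real one mentioned above; the helper `first_column` is a hypothetical example of the "sanitize at the top of the function" pattern.

```python
import io
import numpy as np

# A text file with a single row of data.
single_row = "1.0 2.0 3.0\n"

default = np.loadtxt(io.StringIO(single_row))          # collapses to 1D: (3,)
as_2d = np.loadtxt(io.StringIO(single_row), ndmin=2)   # stays 2D: (1, 3)

def first_column(data):
    """Hypothetical function that relies on 2D indexing."""
    data = np.atleast_2d(data)   # sanitize: guarantees data[:, 0] is valid
    return data[:, 0]

print(default.shape, as_2d.shape)
print(first_column([4, 5, 6]), first_column([[1, 2], [3, 4]]))
```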
Hi All,
I'm with Nathaniel here, in that I don't really see the point of these routines in the first place: broadcasting takes care of many of the initial use cases one might think of, and others are generally not all that well served by them. To me, the examples from scipy do not really support `atleast_?d`, but rather suggest that little thought has been put into higher-dimensional objects which should be treated as stacks of row or column vectors. My sense is that we're better off developing the direction started with `matmul`, perhaps adding `matvecmul` etc.
More to the point of the initial inquiry: what is the advantage of having a general `np.atleast_nd` routine over doing
```
np.array(a, copy=False, ndmin=n)
```
or, for a list of inputs,
```
[np.array(a, copy=False, ndmin=n) for a in input_list]
```
All the best,
Marten
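As a quick check of what this alternative does: the keyword that `np.array` accepts for a minimum number of dimensions is `ndmin`, and it prepends size-1 dimensions. The sketch below leaves out `copy` because its meaning differs between NumPy versions; the sample inputs are made up.

```python
import numpy as np

# A 1D input padded to three dimensions: the new axes are prepended.
padded = np.array([1, 2, 3], ndmin=3)
print(padded.shape)   # (1, 1, 3)

# An input that already has enough dimensions keeps its shape.
unchanged = np.array(np.zeros((2, 3, 4, 5)), ndmin=3)
print(unchanged.shape)   # (2, 3, 4, 5)

# The list-of-inputs form from the message.
input_list = [5, [1, 2], [[1, 2], [3, 4]]]
shapes = [np.array(a, ndmin=2).shape for a in input_list]
print(shapes)   # [(1, 1), (1, 2), (2, 2)]
```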
We use np.atleast_2d extensively in scikit-image, and I also use it in a *lot* of my own code now that scikit-learn stopped accepting 1D arrays as feature vectors.
what is the advantage of `np.atleast_nd` over `np.array(a, copy=False, ndmin=n)`
Readability, clearly.
My only concern is the described behavior of np.atleast_3d, which came as a surprise. I certainly would expect the “atleast” family to all work in the same way as broadcasting, i.e. prepending singleton dimensions. Prepend/append behavior can be controlled either by keyword or simply by using .T, I don’t mind either way.
Juan.
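To make the comparison with broadcasting concrete, here is a small sketch (array names and shapes are made up). Broadcasting treats a 1D array of shape (3,) as if it were (1, 1, 3), which is what prepending gives; the shape produced by `np.atleast_3d` lines up differently.

```python
import numpy as np

small = np.arange(3)          # shape (3,)
big = np.zeros((2, 4, 3))

# Broadcasting implicitly prepends size-1 dimensions to `small`.
implicit = (small + big).shape                       # (2, 4, 3)

# Explicit prepending gives the same alignment.
prepended = np.array(small, ndmin=3)                 # shape (1, 1, 3)
explicit = (prepended + big).shape                   # (2, 4, 3)

# atleast_3d puts the data on the middle axis instead: shape (1, 3, 1),
# which does not broadcast against (2, 4, 3).
try:
    np.atleast_3d(small) + big
    mismatch = False
except ValueError:
    mismatch = True

print(implicit, explicit, mismatch)
```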
There is another wonky reason for using the atleast_?d functions, in that they use reshape to be fully duck typed ;) (in newer versions at least, probably mostly for sparse arrays, not sure).
Tend to agree though, especially considering the confusing order of 3d, which I suppose is likely due to some linalg considerations. Of course you could supply something like an insertion order of (1, 0, 2) to denote the current 3D case in the nd one, but frankly it seems to me likely harder to understand how it works than to write your own functions to just do it.
Scipy uses the 3D case exactly never (once in a test). I have my doubts many would notice if we deprecate the 3D case, but then it is likely more trouble than gain.
- Sebastian
On Jul 6, 2016 6:12 AM, "Joseph Fox-Rabinovitz" jfoxrabinovitz@gmail.com wrote:
The use case that motivated this function in the first place is that I am doing some processing on 4D arrays and I need to reduce them but return a result with the original dimensionality (but not shape). atleast_nd seemed like a better solution than atleast_4d.
This is a tangent that might not apply given the details of your code, but isn't this what keepdims is for? (And keepdims has the huge advantage that it knows which axes are being reduced and thus where to put the new axes.)
I guess even if I couldn't use keepdims for some reason, my inclination would be to try to emulate it by fixing up the axes as I went, because I'd find it easier to verify that I hadn't accidentally misaligned things if the reductions and fixups were local to each other, and explicit axis insertions are much easier than trying to remember whether atleast_nd prepends or appends. This of course is all based on some vague guess at what your code actually looks like though...
-n
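For reference, the `keepdims` behavior mentioned above, plus a local "fix up the axis as you go" variant, might look like this (the array and the choice of axes are made up for illustration):

```python
import numpy as np

data = np.ones((2, 3, 4, 5))

# A plain reduction drops the reduced axes.
reduced = data.sum(axis=(1, 3))                  # shape (2, 4)

# keepdims=True leaves size-1 axes exactly where the reduction happened.
kept = data.sum(axis=(1, 3), keepdims=True)      # shape (2, 1, 4, 1)

# The result still broadcasts against the original array.
normalized = data / kept                         # shape (2, 3, 4, 5)

# Emulating keepdims by hand, right next to the reduction.
manual = np.expand_dims(data.max(axis=2), axis=2)   # shape (2, 3, 1, 5)

print(reduced.shape, kept.shape, normalized.shape, manual.shape)
```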
I was using "reduce" in an abstract sense. I put a 4D array in and get a 1-3D array out, depending on some other parameters (not strictly just by reduction, although that is the net effect). The placement of the dimensions is irrelevant; I just need to make the output 4D again for further calculations. Since the output can have a different number of dims from case to case, I wrote this function as a handy tool to avoid conditionals.
I realize that this is not a common use case, but it seemed like a thing someone else might find useful one day.
Joe
On Tue, Jul 5, 2016 at 10:06 PM, Nathaniel Smith njs@pobox.com wrote:
I don't know how typical I am in this. But it does make me wonder if the atleast_* functions act as an attractive nuisance, where new users take their presence as an implicit recommendation that they are actually a useful thing to reach for, even though they... aren't that. And maybe we should be recommending folk move away from them rather than trying to extend them further?
Agreed. I would avoid adding atleast_nd. We could discourage using atleast_3d (certainly the behavior is indeed surprising), but I'm not sure it's worth the trouble.
I still think this function is useful. I have made a change so that it only accepts one array, as Marten suggested, making the API much cleaner than that of its siblings. The side on which the new dimensions will be added is configurable via the `where` parameter, which currently accepts 'before' and 'after', but can be changed to accept sequences or even dicts. The change also resulted in finding a bug in the masked array versions of the atleast functions, which the PR now fixes and adds regression tests for. If the devs do decide to discard this PR, I will of course submit the bug fix separately.
Joe
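A rough sketch of what a single-array function with a 'before'/'after' option could look like is given below. This is not the code from the PR; the signature, the parameter name `where` (taken from the description above), and the reshape-based body are assumptions made for illustration only.

```python
import numpy as np

def atleast_nd(a, ndim, where="before"):
    """Illustrative sketch only, not the implementation in the PR.

    Pads the shape of `a` with size-1 dimensions until it has at least
    `ndim` dimensions, either before or after the existing ones.
    """
    if where not in ("before", "after"):
        raise ValueError("where must be 'before' or 'after'")
    a = np.asanyarray(a)
    missing = ndim - a.ndim
    if missing <= 0:
        return a                      # already has enough dimensions
    pad = (1,) * missing
    new_shape = pad + a.shape if where == "before" else a.shape + pad
    return a.reshape(new_shape)

x = np.arange(3)
print(atleast_nd(x, 4).shape)                  # new axes first
print(atleast_nd(x, 4, where="after").shape)   # new axes last
```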
I wouldn't have the keyword be "where", as that collides with the notion of "where" elsewhere in numpy.
Agreed. I was originally going with "side", but I want a name that can later be extended to accept arbitrary specs without having to change the word. Perhaps "pos"? I am open to suggestions.
Joe
On 2016/07/06 8:25 AM, Benjamin Root wrote:
I wouldn't have the keyword be "where", as that collides with the notion of "where" elsewhere in numpy.
Agreed. Maybe "side"?
(I find atleast_1d and atleast_2d to be very helpful for handling inputs, as Ben noted; I'm skeptical as to the value of atleast_3d and atleast_nd.)
Eric
at_leastnd would be useful for nd image processing in a very analogous way to how at_least2d is used by scikit-image, assuming it prepends. The at_least3d choice is baffling, seems analogous to the 0.5-based indexing presented at PyCon, and should be "fun" to deprecate. =P
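For reference, the shapes under discussion can be checked directly. This is a small sketch that uses only the existing `np.atleast_2d` and `np.atleast_3d`; the input arrays `vec` and `mat` are arbitrary illustrations, not taken from the thread.

```python
import numpy as np

vec = np.arange(3)        # 1-D input, shape (3,)
mat = np.ones((2, 3))     # 2-D input, shape (2, 3)

# atleast_2d prepends a length-1 axis to 1-D input.
shape_2d_from_1d = np.atleast_2d(vec).shape

# atleast_3d both prepends and appends a length-1 axis for 1-D input ...
shape_3d_from_1d = np.atleast_3d(vec).shape

# ... but only appends one for 2-D input.
shape_3d_from_2d = np.atleast_3d(mat).shape

print(shape_2d_from_1d)   # (1, 3)
print(shape_3d_from_1d)   # (1, 3, 1)
print(shape_3d_from_2d)   # (2, 3, 1)
```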
On Wed, Jul 6, 2016 at 3:01 PM, Juan Nunez-Iglesias jni.soma@gmail.com wrote:
at_leastnd would be useful for nd image processing in a very analogous way to how at_least2d is used by scikit-image, assuming it prepends. The at_least3d choice is baffling, seems analogous to the 0.5-based indexing presented at PyCon, and should be "fun" to deprecate. =P
at_leastnd prepends by default, has an option to append instead, and certainly does not 0.5-pend under any circumstances. `np.swapaxes` and `np.rollaxis` are there for a reason. If atleast_3d is deprecated because of its funky behavior, atleast_nd may be a useful replacement.
Joe
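To make the prepend/append behavior described above concrete, here is a minimal sketch. It is not the code from PR #7804: the function name `atleast_nd_sketch`, the keyword name `pos`, and its values `'before'` and `'after'` are assumptions chosen for illustration, based only on the description in this thread.

```python
import numpy as np


def atleast_nd_sketch(arr, ndim, pos="before"):
    """Hypothetical helper: pad `arr` with length-1 axes up to `ndim`.

    pos='before' prepends the new axes, pos='after' appends them.
    Inputs that already have `ndim` or more dimensions are returned as-is.
    """
    arr = np.asanyarray(arr)
    missing = ndim - arr.ndim
    if missing <= 0:
        return arr
    if pos == "before":
        new_shape = (1,) * missing + arr.shape
    elif pos == "after":
        new_shape = arr.shape + (1,) * missing
    else:
        raise ValueError("pos must be 'before' or 'after'")
    return arr.reshape(new_shape)


vec = np.arange(3)
print(atleast_nd_sketch(vec, 3).shape)               # (1, 1, 3)
print(atleast_nd_sketch(vec, 3, pos="after").shape)  # (3, 1, 1)
```

Any other placement can then be reached from one of these two results with `np.swapaxes` or `np.rollaxis`, which is the point made above.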
On Wed, Jul 6, 2016 at 2:57 PM, Eric Firing efiring@hawaii.edu wrote:
On 2016/07/06 8:25 AM, Benjamin Root wrote:
I wouldn't have the keyword be "where", as that collides with the notion of "where" elsewhere in numpy.
Agreed. Maybe "side"?
I have tentatively changed it to "pos". The reason that I don't like "side" is that it implies only a subset of the possible ways that the position of the new dimensions can be specified. The current implementation only puts things on one side or the other, but I have considered also allowing an array of indices at which to place new dimensions, and/or a dictionary keyed by the starting ndims. I do not think "side" would be appropriate for these extended cases, even if they are very unlikely to ever materialize.
Joe
I don't see how one could define a spec that would take an arbitrary array of indices at which to place new dimensions. By definition, you don't know how many dimensions are going to be added. If you knew, then you wouldn't be calling this function. I can only imagine simple rules such as 'left' or 'right' or maybe something akin to what at_least3d() implements.
Joseph FoxRabinovitz <jfoxrabinovitz <at> gmail.com> writes:
I have tentatively changed it to "pos". The reason that I don't like "side" is that it implies only a subset of the possible ways that the position of the new dimensions can be specified. The current implementation only puts things on one side or the other, but I have considered also allowing an array of indices at which to place new dimensions, and/or a dictionary keyed by the starting ndims. I do not think "side" would be appropriate for these extended cases, even if they are very unlikely to ever materialize.
Joe
What about `order='C'` or `order='F'` for the argument name?
On Wed, 2016-07-06 at 15:30 -0400, Benjamin Root wrote:
I don't see how one could define a spec that would take an arbitrary array of indices at which to place new dimensions. By definition, you don't know how many dimensions are going to be added. If you knew, then you wouldn't be calling this function. I can only imagine simple rules such as 'left' or 'right' or maybe something akin to what at_least3d() implements.
You just give a reordered range, so that (1, 0, 2) would be the current 3D version. If 1D, fill in `1` and `2`, if 2D, fill in only `2` (0D, add everything of course). However, I have my doubts that it is actually easier to understand than to write yourself ;).
- Sebastian
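One possible reading of the "reordered range" idea, sketched in code. The interpretation is an assumption: the entries of the spec at index `arr.ndim` and beyond are taken as the output positions of the new length-1 axes, while the existing axes keep their relative order. The function name `atleast_nd_from_spec` and the parameter name `spec` are hypothetical.

```python
import numpy as np


def atleast_nd_from_spec(arr, spec):
    """Hypothetical helper: `spec` is a permutation of range(n).

    For an input with k < n dimensions, the entries spec[k:] are read as
    the positions (in the n-D result) at which length-1 axes are inserted.
    """
    arr = np.asanyarray(arr)
    n = len(spec)
    if sorted(spec) != list(range(n)):
        raise ValueError("spec must be a permutation of range(len(spec))")
    if arr.ndim >= n:
        return arr
    # Insert in ascending order so each position is valid when it is used.
    for axis in sorted(spec[arr.ndim:]):
        arr = np.expand_dims(arr, axis)
    return arr


spec_3d = (1, 0, 2)  # the spec said above to reproduce the current 3D version
print(atleast_nd_from_spec(np.arange(3), spec_3d).shape)      # (1, 3, 1)
print(atleast_nd_from_spec(np.ones((2, 3)), spec_3d).shape)   # (2, 3, 1)
print(atleast_nd_from_spec(np.float64(1.0), spec_3d).shape)   # (1, 1, 1)
```

Under this reading, `(1, 0, 2)` gives the same shapes as `np.atleast_3d` for 0-D, 1-D and 2-D inputs.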
On Thu, Jul 7, 2016 at 4:34 AM, Sebastian Berg sebastian@sipsolutions.net wrote:
On Wed, 2016-07-06 at 15:30 -0400, Benjamin Root wrote:
I don't see how one could define a spec that would take an arbitrary array of indices at which to place new dimensions. By definition, you don't know how many dimensions are going to be added. If you knew, then you wouldn't be calling this function. I can only imagine simple rules such as 'left' or 'right' or maybe something akin to what at_least3d() implements.
You just give a reordered range, so that (1, 0, 2) would be the current 3D version. If 1D, fill in `1` and `2`, if 2D, fill in only `2` (0D, add everything of course).
I was originally thinking (1, 0) for the 2D case. Just go along the list and fill as many dims as necessary. Your way is much better since it does not require a different operation for positive and negative indices.
However, I have my doubts that it is actually easier to understand than to write yourself ;).
A dictionary or ragged list would be better for that: either {1: (1, 0), 2: (2,)} or [(1, 0), (2,)]. The first is clearer, since the index in the list is the starting ndim - 1.
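A rough sketch of how the dictionary form could work. This rests on an assumed reading that the thread does not spell out: each value is a sequence of axes passed to `np.expand_dims` one after another, and starting ndims without an entry are returned unchanged. The names `atleast_nd_from_table` and `table` are hypothetical.

```python
import numpy as np


def atleast_nd_from_table(arr, table):
    """Hypothetical helper: `table` maps a starting ndim to the axes at
    which length-1 dimensions are inserted, applied in the given order."""
    arr = np.asanyarray(arr)
    for axis in table.get(arr.ndim, ()):
        arr = np.expand_dims(arr, axis)
    return arr


table_3d = {1: (1, 0), 2: (2,)}
print(atleast_nd_from_table(np.arange(3), table_3d).shape)      # (1, 3, 1)
print(atleast_nd_from_table(np.ones((2, 3)), table_3d).shape)   # (2, 3, 1)
```

With this reading, the dictionary `{1: (1, 0), 2: (2,)}` gives the same shapes as `np.atleast_3d` for 1-D and 2-D inputs; a 0-D entry would have to be added separately.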
I would like to follow up on my original PR (#7804). While there appears to be some debate as to whether the PR is numpy material to begin with, there do not appear to be any technical issues with it. To make the decision more straightforward, I factored out the non-controversial bug fixes to masked arrays into PR #7823, along with their regression tests. This way, the original enhancement can be closed or left hanging indefinitely (even though I hope neither happens). PR #7804 still has the bug fixes duplicated in it.
Regards,
Joe
Hi,
I would like to revitalize the discussion on including PR#7804 (atleast_nd function) at Stephan Hoyer's request. atleast_nd has come up as a convenient workaround for #8206 (adding padding options to diff) to be able to do broadcasting with the required dimensions reversed.
Regards,
Joe
participants (11)
Benjamin Root
David
Eric Firing
josef.pktd＠gmail.com
Joseph Fox-Rabinovitz
Juan Nunez-Iglesias
Marten van Kerkwijk
Nathaniel Smith
Ralf Gommers
Sebastian Berg
Stephan Hoyer