Mailman 3 Added atleast_nd, request for clarification/cleanup of atleast_3d - NumPy-Discussion

Added atleast_nd, request for clarification/cleanup of atleast_3d

Joseph Fox-Rabinovitz

July 6, 2016

4:09 a.m.

Hi,

I have generalized np.atleast_1d, np.atleast_2d, np.atleast_3d with a function np.atleast_nd in PR#7804 (https://github.com/numpy/numpy/pull/7804).

As a result of this PR, I have a couple of questions about `np.atleast_3d`. `np.atleast_3d` appears to do something weird with the dimensions: If the input is 1D, it prepends and appends a size-1 dimension. If the input is 2D, it appends a size-1 dimension. This is inconsistent with `np.atleast_2d`, which always prepends (as does `np.atleast_nd`).

- Is there any reason for this behavior?
- Can it be cleaned up (e.g., by reimplementing `np.atleast_3d` in terms of `np.atleast_nd`, which is actually much simpler)? This would be a slight API change since the output would not be exactly the same.

Thanks,

-Joe
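For readers who want to see the inconsistency described above, a small check of the resulting shapes may help. The array names `x1` and `x2` are only illustrative.

```python
import numpy as np

x1 = np.arange(3)                # 1-D input, shape (3,)
x2 = np.arange(6).reshape(2, 3)  # 2-D input, shape (2, 3)

# atleast_2d prepends the new axis.
print(np.atleast_2d(x1).shape)   # (1, 3)

# atleast_3d prepends AND appends for 1-D input ...
print(np.atleast_3d(x1).shape)   # (1, 3, 1)

# ... and only appends for 2-D input.
print(np.atleast_3d(x2).shape)   # (2, 3, 1)
```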

Nathaniel Smith

July 2016

5:06 a.m.

New subject: Added atleast_nd, request for clarification/cleanup of atleast_3d

On Jul 5, 2016 9:09 PM, "Joseph Fox-Rabinovitz" <jfoxrabinovitz@gmail.com> wrote:

...

Changing atleast_3d seems likely to break a bunch of stuff...

Beyond that, I find it hard to have an opinion about the best design for these functions, because I don't think I've ever encountered a situation where they were actually what I wanted. I'm not a big fan of coercing dimensions in the first place, for the usual "refuse to guess" reasons. And then generally if I do want to coerce an array to another dimension, then I have some opinion about where the new dimensions should go, and/or I have some opinion about the minimum acceptable starting dimension, and/or I have a maximum dimension in mind. (E.g. "coerce 1d inputs into a column matrix; 0d or 3d inputs are an error" -- atleast_2d is zero-for-three on that requirements list.)

I don't know how typical I am in this. But it does make me wonder if the atleast_* functions act as an attractive nuisance, where new users take their presence as an implicit recommendation that they are actually a useful thing to reach for, even though they... aren't that. And maybe we should be recommending folk move away from them rather than trying to extend them further?

Or maybe they're totally useful and I'm just missing it. What's your use case that motivates atleast_nd?

-n
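As an illustration of the "coerce 1d inputs into a column matrix; 0d or 3d inputs are an error" requirement, here is a minimal sketch. The helper name `as_column_matrix` is hypothetical and not part of NumPy, and passing 2-D input through unchanged is an assumption about what such a helper would do.

```python
import numpy as np

def as_column_matrix(x):
    """Hypothetical helper: 1-D becomes a column, 2-D passes through,
    anything else (0-D, 3-D, ...) is rejected instead of guessed at."""
    x = np.asarray(x)
    if x.ndim == 1:
        return x[:, np.newaxis]   # new axis goes last, giving shape (n, 1)
    if x.ndim == 2:
        return x
    raise ValueError(f"expected a 1-D or 2-D input, got {x.ndim}-D")

print(as_column_matrix([1, 2, 3]).shape)  # (3, 1)
print(np.atleast_2d([1, 2, 3]).shape)     # (1, 3): a row, not a column
```

By contrast, `np.atleast_2d` puts the new axis in front and accepts both 0-D and 3-D input without complaint, which is the "zero-for-three" point.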

Ralf Gommers

6:21 a.m.

On Wed, Jul 6, 2016 at 7:06 AM, Nathaniel Smith <njs@pobox.com> wrote: On Jul 5, 2016 9:09 PM, "Joseph Fox-Rabinovitz" <jfoxrabinovitz@gmail.com>

...

I think you're just missing it :) atleast_1d/2d are used quite a bit in Scipy and Statsmodels (those are the only ones I checked), and in the large majority of cases it's the best thing to use there. There's a bunch of atleast_2d calls with a transpose appended because the input needs to be treated as columns instead of rows, but that's still efficient and readable enough.

For 3D/nD I can see that you'd need more control over where the dimensions go, but 1D/2D are fine.

Ralf
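The "atleast_2d with a transpose appended" pattern mentioned above looks roughly like this. The variable names are illustrative only.

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0])

rows = np.atleast_2d(x)      # 1-D input treated as a single row
cols = np.atleast_2d(x).T    # same input treated as a single column

print(rows.shape)  # (1, 3)
print(cols.shape)  # (3, 1)

# Caveat: the transpose is unconditional, so 2-D input is flipped as well.
m = np.zeros((4, 2))
print(np.atleast_2d(m).T.shape)  # (2, 4)
```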

josef.pktd＠gmail.com

7:29 a.m.

On Wed, Jul 6, 2016 at 2:21 AM, Ralf Gommers <ralf.gommers@gmail.com> wrote:

...

As Ralf pointed out, they are used in statsmodels. I do find them useful as a replacement for several lines of ifs and reshapes.

We still need in many cases atleast_2d_cols, which appends the newaxis if necessary, roughly the equivalent of

    if x.ndim == 1:
        x = x[:, None]
    else:
        x = np.atleast_2d(x)

Josef
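Wrapped into a function, the snippet above could look like the sketch below. This follows the description in the message and is not claimed to be the statsmodels implementation; the `np.asarray` call is an added assumption so that lists and scalars work too.

```python
import numpy as np

def atleast_2d_cols(x):
    """Sketch: like np.atleast_2d, but a 1-D input becomes a column."""
    x = np.asarray(x)
    if x.ndim == 1:
        return x[:, None]        # append the new axis: (n,) -> (n, 1)
    return np.atleast_2d(x)      # 0-D -> (1, 1); 2-D and higher unchanged

print(atleast_2d_cols([1, 2, 3]).shape)         # (3, 1)
print(atleast_2d_cols(np.zeros((4, 2))).shape)  # (4, 2)
```

Unlike `np.atleast_2d(x).T`, this leaves input that is already 2-D untouched.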

...

josef.pktd＠gmail.com

7:41 a.m.

On Wed, Jul 6, 2016 at 3:29 AM, <josef.pktd@gmail.com> wrote:

...

For 3D/nD I can see that you'd need more control over where the dimensions go, but 1D/2D are fine.

statsmodels currently has very little code with ndim > 2, so I have no overview of possible use cases, but it would be necessary to have full control over the added axis, since axes have a strict meaning and stats still prefers Fortran order to the default numpy/C ordering.

Josef

...

Joseph Fox-Rabinovitz

1:12 p.m.

I can add a keyword-only argument that lets you put the new dims before or after the existing ones. I am not sure how to specify arbitrary patterns for the new dimensions, but that should take care of most use cases.

The use case that motivated this function in the first place is that I am doing some processing on 4D arrays and I need to reduce them but return a result with the original dimensionality (but not shape). atleast_nd seemed like a better solution than atleast_4d.

-Joe

On Wed, Jul 6, 2016 at 3:41 AM, <josef.pktd@gmail.com> wrote:

...

_______________________________________________ NumPy-Discussion mailing list NumPy-Discussion@scipy.org https://mail.scipy.org/mailman/listinfo/numpy-discussion


Benjamin Root

1:51 p.m.

While atleast_1d/2d/3d predate my involvement in numpy, I am probably partly to blame for popularizing them as I helped to fix them up a fair amount. I wouldn't call their use "guessing". Rather, I would treat them as useful input sanitizers. If your function is going to be doing 2d indexing on an input, then it is very convenient to have atleast_2d() at the top of your function, not only to sanitize the input, but to make it clear that your code expects at least two dimensions. One place where it is used is in np.loadtxt(..., ndmin=N) to protect against the situation of a single row of data becoming a 1-D array rather than a 2-D array (or an empty text file returning something completely useless).

I have previously pointed out the oddity with atleast_3d(). I can't remember the explanation I got though. Maybe someone can find the old thread that has the explanation, if any?

I think the keyword argument approach for controlling the behavior might be a good approach, provided that a suitable design could be devised. 1 & 2 dimensions is fairly trivial to control, but 3+ dimensions has too many degrees of freedom for me to consider.

Cheers!

Ben Root

On Wed, Jul 6, 2016 at 9:12 AM, Joseph Fox-Rabinovitz <jfoxrabinovitz@gmail.com> wrote:

...
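The two uses described in this message (atleast_2d() as an input sanitizer, and `ndmin` in np.loadtxt) can be sketched as follows. The function `first_column` and the in-memory text files are illustrative assumptions only.

```python
import io
import numpy as np

def first_column(data):
    # Sanitize up front: the 2-D indexing below needs at least two axes,
    # and this line also documents that expectation.
    data = np.atleast_2d(data)
    return data[:, 0]

print(first_column([[1, 2], [3, 4]]))  # first column of a 2-D input
print(first_column([1, 2, 3]))         # a 1-D input is treated as one row

# A file with a single row of data collapses to 1-D unless ndmin is given.
print(np.loadtxt(io.StringIO("1 2 3\n")).shape)           # (3,)
print(np.loadtxt(io.StringIO("1 2 3\n"), ndmin=2).shape)  # (1, 3)
```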


Marten van Kerkwijk

2:22 p.m.

Hi All,

I'm with Nathaniel here, in that I don't really see the point of these routines in the first place: broadcasting takes care of many of the initial use cases one might think of, and others are generally not all that well served by them: the examples from scipy to me do not really support `atleast_?d`, but rather suggest that little thought has been put into higher-dimensional objects which should be treated as stacks of row or column vectors. My sense is that we're better off developing the direction started with `matmul`, perhaps adding `matvecmul` etc.

More to the point of the initial inquiry: what is the advantage of having a general `np.atleast_nd` routine over doing
```
np.array(a, copy=False, ndmin=n)
```
or, for a list of inputs,
```
[np.array(a, copy=False, ndmin=n) for a in input_list]
```

All the best,

Marten
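A short check of what the `np.array` alternative does may be useful here. The keyword that `np.array` accepts for this is `ndmin`, and it prepends size-1 axes. The `copy` argument is left out of this sketch because its meaning differs between NumPy versions, so the arrays below are copied by default.

```python
import numpy as np

a = np.arange(3)
b = np.array(a, ndmin=3)   # size-1 axes are prepended as needed
print(b.shape)             # (1, 1, 3)

# The list form, applied to inputs of mixed dimensionality.
input_list = [1.0, [1, 2], np.zeros((2, 2))]
shapes = [np.array(v, ndmin=2).shape for v in input_list]
print(shapes)              # [(1, 1), (1, 2), (2, 2)]
```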

Juan Nunez-Iglesias

2:56 p.m.

We use np.atleast_2d extensively in scikit-image, and I also use it in a *lot* of my own code now that scikit-learn stopped accepting 1D arrays as feature vectors.

...

what is the advantage of `np.atleast_nd` over `np.array(a, copy=False, ndmin=n)`

Readability, clearly.

My only concern is the described behavior of np.atleast_3d, which came as a surprise. I certainly would expect the “atleast” family to all work in the same way as broadcasting, i.e. prepending singleton dimensions. Prepend/append behavior can be controlled either by keyword or simply by using .T; I don’t mind either way.

Juan.

Sebastian Berg

3:14 p.m.

On Mi, 2016-07-06 at 10:22 -0400, Marten van Kerkwijk wrote:

...

There is another wonky reason for using the atleast_?d functions, in that they use reshape to be fully duck typed ;) (in newer versions at least, probably mostly for sparse arrays, not sure).

I tend to agree though, especially considering the confusing order of 3d, which I suppose is likely due to some linalg considerations.

Of course you could supply something like an insertion order of (1, 0, 2) to denote the current 3D case in the nd one, but frankly it seems to me likely harder to understand how it works than to write your own functions to just do it.

Scipy uses the 3D case exactly never (once in a test). I have my doubts many would notice if we deprecate the 3D case, but then it is likely more trouble than gain.

- Sebastian

...

Nathaniel Smith

4:35 p.m.

On Jul 6, 2016 6:12 AM, "Joseph Fox-Rabinovitz" <jfoxrabinovitz@gmail.com> wrote:

...

This is a tangent that might not apply given the details of your code, but isn't this what keepdims is for? (And keepdims has the huge advantage that it knows which axes are being reduced and thus where to put the new axes.) I guess even if I couldn't use keepdims for some reason, my inclination would be to try to emulate it by fixing up the axes as I went, because I'd find it easier to verify that I hadn't accidentally misaligned things if the reductions and fix-ups were local to each other, and explicit axis insertions are much easier than trying to remember whether atleast_nd prepends or appends. This of course is all based on some vague guess at what your code actually looks like though... -n
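For reference, the keepdims approach mentioned here can be sketched as below. The 4D shape and the choice of reduced axes are made up for illustration.

```python
import numpy as np

a = np.ones((2, 3, 4, 5))

# A plain reduction drops the reduced axes.
total = a.sum(axis=(1, 2))
print(total.shape)        # (2, 5)

# keepdims leaves size-1 axes exactly where the reduced axes were.
total_kept = a.sum(axis=(1, 2), keepdims=True)
print(total_kept.shape)   # (2, 1, 1, 5)

# Emulating keepdims with explicit, local axis insertions.
manual = np.expand_dims(np.expand_dims(total, 1), 2)
print(manual.shape)       # (2, 1, 1, 5)

# Because the axes line up, the result broadcasts against the original.
centered = a - a.mean(axis=(1, 2), keepdims=True)
print(centered.shape)     # (2, 3, 4, 5)
```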

Joseph Fox-Rabinovitz

4:48 p.m.

I was using "reduce" in an abstract sense. I put a 4D array in and get a 1-3D array out, depending on some other parameters (not strictly just by reduction, although that is the net effect). The placement of the dimensions is irrelevant; I just need to make the output 4D again for further calculations. Since I can have cases where the output has a different number of dims, I wrote this function as a handy tool to avoid conditionals. I realize that this is not a common use case, but it seemed like a thing someone else might find useful one day.

-Joe

On Wed, Jul 6, 2016 at 12:35 PM, Nathaniel Smith <njs@pobox.com> wrote:

...

Stephan Hoyer

5:43 p.m.

On Tue, Jul 5, 2016 at 10:06 PM, Nathaniel Smith <njs@pobox.com> wrote:

...

Agreed. I would avoid adding atleast_nd. We could discourage using atleast_3d (certainly the behavior is indeed surprising), but I'm not sure it's worth the trouble.

Joseph Fox-Rabinovitz

6:21 p.m.

I still think this function is useful. I have made a change so that it only accepts one array, as Marten suggested, making the API much cleaner than that of its siblings. The side on which the new dimensions will be added is configurable via the `where` parameter, which currently accepts 'before' and 'after', but can be changed to accept sequences or even dicts.

The change also resulted in finding a bug in the masked array versions of the atleast functions, which the PR now fixes and adds regression tests for. If the devs do decide to discard this PR, I will of course submit the bug fix separately.

-Joe

On Wed, Jul 6, 2016 at 1:43 PM, Stephan Hoyer <shoyer@gmail.com> wrote:

...

Benjamin Root

6:25 p.m.

I wouldn't have the keyword be "where", as that collides with the notion of "where" elsewhere in numpy. On Wed, Jul 6, 2016 at 2:21 PM, Joseph Fox-Rabinovitz < jfoxrabinovitz@gmail.com> wrote:

...

Joseph Fox-Rabinovitz

6:36 p.m.

Agreed. I was originally going with "side", but I want something that can be changed to accepting arbitrary specs without changing the word. Perhaps "pos"? I am open to suggestion. -Joe On Wed, Jul 6, 2016 at 2:25 PM, Benjamin Root <ben.v.root@gmail.com> wrote:

...

Eric Firing

6:57 p.m.

On 2016/07/06 8:25 AM, Benjamin Root wrote:

...

I wouldn't have the keyword be "where", as that collides with the notion of "where" elsewhere in numpy.

Agreed. Maybe "side"? (I find atleast_1d and atleast_2d to be very helpful for handling inputs, as Ben noted; I'm skeptical as to the value of atleast_3d and atleast_nd.) Eric

Juan Nunez-Iglesias

7:01 p.m.

atleast_nd would be useful for nd image processing in a very analogous way to how atleast_2d is used by scikit-image, assuming it prepends. The atleast_3d choice is baffling, seems analogous to the 0.5-based indexing presented at PyCon, and should be "fun" to deprecate. =P

On 6 July 2016 at 2:57:57 PM, Eric Firing (efiring@hawaii.edu) wrote:

On 2016/07/06 8:25 AM, Benjamin Root wrote:

...


Joseph Fox-Rabinovitz

7:15 p.m.

On Wed, Jul 6, 2016 at 3:01 PM, Juan Nunez-Iglesias <jni.soma@gmail.com> wrote:

...

atleast_nd prepends by default, has an option to append instead and certainly does not 0.5-pend under any circumstances. `np.swapaxes` and `np.rollaxis` are there for a reason. If atleast_3d is deprecated because of its funky behavior, atleast_nd may be a useful replacement.

-Joe

Joseph Fox-Rabinovitz

7:20 p.m.

On Wed, Jul 6, 2016 at 2:57 PM, Eric Firing <efiring@hawaii.edu> wrote:

...

I have tentatively changed it to "pos". The reason that I don't like "side" is that it implies only a subset of the possible ways that the position of the new dimensions can be specified. The current implementation only puts things on one side or the other, but I have considered also allowing an array of indices at which to place new dimensions, and/or a dictionary keyed by the starting ndims. I do not think "side" would be appropriate for these extended cases, even if they are very unlikely to ever materialize.

-Joe
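To make the one-sided behavior under discussion concrete, here is a minimal sketch of a single-array function with a keyword-only `pos` argument. It is a hypothetical illustration and not the code from the PR; the accepted values 'before' and 'after' are taken from the earlier description of the parameter, and the default of 'before' follows the statement that prepending is the default.

```python
import numpy as np

def atleast_nd(a, ndim, *, pos="before"):
    """Hypothetical sketch: pad the shape of `a` with size-1 axes on one
    side until it has at least `ndim` dimensions."""
    a = np.asarray(a)
    missing = ndim - a.ndim
    if missing <= 0:
        return a                          # already has enough dimensions
    if pos == "before":
        shape = (1,) * missing + a.shape  # prepend, like broadcasting
    elif pos == "after":
        shape = a.shape + (1,) * missing  # append
    else:
        raise ValueError("pos must be 'before' or 'after'")
    return a.reshape(shape)

x = np.zeros((2, 3))
print(atleast_nd(x, 4).shape)               # (1, 1, 2, 3)
print(atleast_nd(x, 4, pos="after").shape)  # (2, 3, 1, 1)
```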

...

Benjamin Root

7:30 p.m.

I don't see how one could define a spec that would take an arbitrary array of indices at which to place new dimensions. By definition, you don't know how many dimensions are going to be added. If you knew, then you wouldn't be calling this function. I can only imagine simple rules such as 'left' or 'right' or maybe something akin to what atleast_3d() implements.

On Wed, Jul 6, 2016 at 3:20 PM, Joseph Fox-Rabinovitz <jfoxrabinovitz@gmail.com> wrote:

...

Sebastian Berg

8:34 a.m.

On Mi, 2016-07-06 at 15:30 -0400, Benjamin Root wrote:

...

I don't see how one could define a spec that would take an arbitrary array of indices at which to place new dimensions. By definition, you

You just give a reordered range, so that (1, 0, 2) would be the current 3D version. If 1D, fill in `1` and `2`, if 2D, fill in only `2` (0D, add everything of course).

However, I have my doubts that it is actually easier to understand than to write yourself ;).

- Sebastian
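One possible reading of the reordered-range idea is sketched below. The interpretation is an assumption: the first `a.ndim` entries of `order` are taken to be the output positions kept by the existing axes, and the remaining entries are the positions filled with new size-1 axes. The function name `atleast_nd_order` is hypothetical.

```python
import numpy as np

def atleast_nd_order(a, order):
    """Hypothetical sketch: `order` is a permutation of range(n).
    Entries past the first a.ndim name the output positions that
    receive new size-1 axes."""
    a = np.asarray(a)
    n = len(order)
    if sorted(order) != list(range(n)):
        raise ValueError("order must be a permutation of range(len(order))")
    if a.ndim >= n:
        return a
    # Inserting in ascending order puts each new axis at its final position.
    for axis in sorted(order[a.ndim:]):
        a = np.expand_dims(a, axis)
    return a

# Under this reading, (1, 0, 2) reproduces the shapes given by np.atleast_3d.
for shape in [(), (4,), (2, 3), (2, 3, 4)]:
    a = np.zeros(shape)
    print(shape, atleast_nd_order(a, (1, 0, 2)).shape, np.atleast_3d(a).shape)
```

With this interpretation, (2, 1, 0) would prepend everything and (0, 1, 2) would append everything.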

...

Joseph Fox-Rabinovitz

1:11 p.m.

On Thu, Jul 7, 2016 at 4:34 AM, Sebastian Berg <sebastian@sipsolutions.net> wrote:

...

On Mi, 2016-07-06 at 15:30 -0400, Benjamin Root wrote:

...
I don't see how one could define a spec that would take an arbitrary array of indices at which to place new dimensions. By definition, you

You just give a reordered range, so that (1, 0, 2) would be the current 3D version. If 1D, fill in `1` and `2`, if 2D, fill in only `2` (0D, add everything of course).

I was originally thinking (-1, 0) for the 2D case. Just go along the list and fill as many dims as necessary. Your way is much better since it does not require a different operation for positive and negative indices.

...

However, I have my doubts that it is actually easier to understand than to write yourself ;).

A dictionary or ragged list would be better for that: either {1: (1, 0), 2: (2,)} or [(1, 0), (2,)]. The first is clearer, since in the list form the index is the starting ndim - 1.

...

- Sebastian

...
don't know how many dimensions are going to be added. If you knew, then you wouldn't be calling this function. I can only imagine simple rules such as 'left' or 'right' or maybe something akin to what atleast_3d() implements.

On Wed, Jul 6, 2016 at 3:20 PM, Joseph Fox-Rabinovitz <jfoxrabinovitz@gmail.com> wrote:

...
On Wed, Jul 6, 2016 at 2:57 PM, Eric Firing <efiring@hawaii.edu> wrote:

...
On 2016/07/06 8:25 AM, Benjamin Root wrote:

...
I wouldn't have the keyword be "where", as that collides with the notion of "where" elsewhere in numpy.

Agreed. Maybe "side"?

I have tentatively changed it to "pos". The reason that I don't like "side" is that it implies only a subset of the possible ways that the position of the new dimensions can be specified. The current implementation only puts things on one side or the other, but I have considered also allowing an array of indices at which to place new dimensions, and/or a dictionary keyed by the starting ndims. I do not think "side" would be appropriate for these extended cases, even if they are very unlikely to ever materialize.

-Joe

...
(I find atleast_1d and atleast_2d to be very helpful for handling inputs, as Ben noted; I'm skeptical as to the value of atleast_3d and atleast_nd.)

Eric

_______________________________________________ NumPy-Discussion mailing list NumPy-Discussion@scipy.org https://mail.scipy.org/mailman/listinfo/numpy-discussion


Joseph Fox-Rabinovitz

2:41 p.m.

New subject: Added atleast_nd, request for clarification/cleanup of atleast_3d

I would like to follow up on my original PR (7804). While there appears to be some debate as to whether the PR is numpy material to begin with, there do not appear to be any technical issues with it. To make the decision more straightforward, I factored out the non-controversial bug fixes to masked arrays into PR #7823, along with their regression tests. This way, the original enhancement can be closed or left hanging indefinitely (even though I hope neither happens). PR 7804 still has the bug fixes duplicated in it.

Regards,

-Joe

On Thu, Jul 7, 2016 at 9:11 AM, Joseph Fox-Rabinovitz <jfoxrabinovitz@gmail.com> wrote:

...


Joseph Fox-Rabinovitz

October 2016

6:29 p.m.

New subject: Added atleast_nd, request for clarification/cleanup of atleast_3d

Hi,

I would like to revitalize the discussion on including PR#7804 (atleast_nd function) at Stephan Hoyer's request. atleast_nd has come up as a convenient workaround for #8206 (adding padding options to diff) to be able to do broadcasting with the required dimensions reversed.

Regards,

-Joe

On Mon, Jul 11, 2016 at 10:41 AM, Joseph Fox-Rabinovitz <jfoxrabinovitz@gmail.com> wrote:

...


David

July 2016

10:20 p.m.

New subject: [Numpy-discussion] Added atleast_nd, request for clarification/cleanup of atleast_3d

Joseph Fox-Rabinovitz <jfoxrabinovitz <at> gmail.com> writes:

...

On Wed, Jul 6, 2016 at 2:57 PM, Eric Firing <efiring <at> hawaii.edu> wrote:

...
On 2016/07/06 8:25 AM, Benjamin Root wrote:

...
I wouldn't have the keyword be "where", as that collides with the notion of "where" elsewhere in numpy.

Agreed. Maybe "side"?

I have tentatively changed it to "pos". The reason that I don't like "side" is that it implies only a subset of the possible ways that the position of the new dimensions can be specified. The current implementation only puts things on one side or the other, but I have considered also allowing an array of indices at which to place new dimensions, and/or a dictionary keyed by the starting ndims. I do not think "side" would be appropriate for these extended cases, even if they are very unlikely to ever materialize.

-Joe

...
(I find atleast_1d and atleast_2d to be very helpful for handling inputs, as Ben noted; I'm skeptical as to the value of atleast_3d and atleast_nd.)

Eric

What about `order='C'` or `order='F'` for the argument name?

Nathaniel Smith

July 2016

5:06 a.m.

New subject: Added atleast_nd, request for clarification/cleanup of atleast_3d

On Jul 5, 2016 9:09 PM, "Joseph Fox-Rabinovitz" <jfoxrabinovitz@gmail.com> wrote:

...

Ralf Gommers

6:21 a.m.

New subject: Added atleast_nd, request for clarification/cleanup of atleast_3d

On Wed, Jul 6, 2016 at 7:06 AM, Nathaniel Smith <njs@pobox.com> wrote: On Jul 5, 2016 9:09 PM, "Joseph Fox-Rabinovitz" <jfoxrabinovitz@gmail.com>

...

josef.pktd＠gmail.com

7:29 a.m.

New subject: Added atleast_nd, request for clarification/cleanup of atleast_3d

On Wed, Jul 6, 2016 at 2:21 AM, Ralf Gommers <ralf.gommers@gmail.com> wrote:

...

On Wed, Jul 6, 2016 at 7:06 AM, Nathaniel Smith <njs@pobox.com> wrote:

On Jul 5, 2016 9:09 PM, "Joseph Fox-Rabinovitz" <jfoxrabinovitz@gmail.com>

...
wrote:

...
Hi,

I have generalized np.atleast_1d, np.atleast_2d, np.atleast_3d with a function np.atleast_nd in PR#7804 (https://github.com/numpy/numpy/pull/7804).

As a result of this PR, I have a couple of questions about `np.atleast_3d`. `np.atleast_3d` appears to do something weird with the dimensions: If the input is 1D, it prepends and appends a size-1 dimension. If the input is 2D, it appends a size-1 dimension. This is inconsistent with `np.atleast_2d`, which always prepends (as does `np.atleast_nd`).

- Is there any reason for this behavior? - Can it be cleaned up (e.g., by reimplementing `np.atleast_3d` in terms of `np.atleast_nd`, which is actually much simpler)? This would be a slight API change since the output would not be exactly the same.
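The behavior described above can be checked directly with the existing NumPy functions. The snippet below is only an illustration of the current shapes; the variable names are arbitrary.

```python
import numpy as np

v = np.arange(3)      # 1D input, shape (3,)
m = np.ones((2, 3))   # 2D input, shape (2, 3)

print(np.atleast_2d(v).shape)  # (1, 3): size-1 axis prepended
print(np.atleast_3d(v).shape)  # (1, 3, 1): one axis prepended, one appended
print(np.atleast_3d(m).shape)  # (2, 3, 1): size-1 axis appended
```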

Changing atleast_3d seems likely to break a bunch of stuff...

Beyond that, I find it hard to have an opinion about the best design for these functions, because I don't think I've ever encountered a situation where they were actually what I wanted. I'm not a big fan of coercing dimensions in the first place, for the usual "refuse to guess" reasons. And then generally if I do want to coerce an array to another dimension, then I have some opinion about where the new dimensions should go, and/or I have some opinion about the minimum acceptable starting dimension, and/or I have a maximum dimension in mind. (E.g. "coerce 1d inputs into a column matrix; 0d or 3d inputs are an error" -- atleast_2d is zero-for-three on that requirements list.)

I don't know how typical I am in this. But it does make me wonder if the atleast_* functions act as an attractive nuisance, where new users take their presence as an implicit recommendation that they are actually a useful thing to reach for, even though they... aren't that. And maybe we should be recommending folk move away from them rather than trying to extend them further?

Or maybe they're totally useful and I'm just missing it. What's your use case that motivates atleast_nd?
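For comparison, the stricter requirement given as an example above ("coerce 1d inputs into a column matrix; 0d or 3d inputs are an error") could be written by hand roughly as follows. The helper name `as_column_matrix` is hypothetical and is not a NumPy function.

```python
import numpy as np

def as_column_matrix(x):
    """Hypothetical helper: 1D -> column, 2D unchanged, anything else is an error."""
    x = np.asarray(x)
    if x.ndim == 1:
        return x[:, np.newaxis]
    if x.ndim == 2:
        return x
    raise ValueError("expected a 1D or 2D input, got %dD" % x.ndim)

print(as_column_matrix([1, 2, 3]).shape)  # (3, 1)
```

Unlike np.atleast_2d, this refuses 0D and 3D inputs instead of guessing what to do with them.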

I think you're just missing it:) atleast_1d/2d are used quite a bit in Scipy and Statsmodels (those are the only ones I checked), and in the large majority of cases it's the best thing to use there. There's a bunch of atleast_2d calls with a transpose appended because the input needs to be treated as columns instead of rows, but that's still efficient and readable enough.
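The "atleast_2d with a transpose appended" idiom mentioned above looks roughly like this; the variable names are arbitrary.

```python
import numpy as np

x = np.arange(3)                 # 1D input, shape (3,)
cols = np.atleast_2d(x).T        # treat the values as a single column
print(cols.shape)                # (3, 1)

m = np.ones((2, 3))
print(np.atleast_2d(m).T.shape)  # (3, 2)
```

Note that the transpose is applied to inputs that are already 2D as well, so the idiom fits cases where that is the intended orientation.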

...

josef.pktd＠gmail.com

7:41 a.m.

New subject: Added atleast_nd, request for clarification/cleanup of atleast_3d

On Wed, Jul 6, 2016 at 3:29 AM, <josef.pktd@gmail.com> wrote:

...


As Ralf pointed out, they are used in statsmodels. I do find them useful as a replacement for several lines of ifs and reshapes.

In many cases we still need atleast_2d_cols, which appends the newaxis if necessary.

Roughly the equivalent of

    if x.ndim == 1:
        x = x[:, None]
    else:
        x = np.atleast_2d(x)

Josef
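Wrapped into a function, the snippet above might look like the sketch below. This follows the fragment in the message and is not taken from the statsmodels source; the conversion with np.asarray is an added assumption so that lists are accepted too.

```python
import numpy as np

def atleast_2d_cols(x):
    """Sketch: like np.atleast_2d, but a 1D input becomes a column."""
    x = np.asarray(x)
    if x.ndim == 1:
        return x[:, None]      # append the new axis: (n,) -> (n, 1)
    return np.atleast_2d(x)    # 0D -> (1, 1); 2D and higher unchanged

print(atleast_2d_cols([1, 2, 3]).shape)        # (3, 1)
print(atleast_2d_cols(np.ones((2, 3))).shape)  # (2, 3)
```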

...
For 3D/nD I can see that you'd need more control over where the dimensions go, but 1D/2D are fine.

...

Joseph Fox-Rabinovitz

1:12 p.m.

New subject: Added atleast_nd, request for clarification/cleanup of atleast_3d

...

For 3D/nD I can see that you'd need more control over where the dimensions go, but 1D/2D are fine.

statsmodels currently has very little code with ndim > 2, so I have no overview of possible use cases, but it would be necessary to have full control over the added axes, since axes have a strict meaning and stats still prefers Fortran order to the default numpy/C ordering.

Josef

...
...
Ralf


Benjamin Root

1:51 p.m.

New subject: Added atleast_nd, request for clarification/cleanup of atleast_3d

...

I can add a keyword-only argument that lets you put the new dims before or after the existing ones. I am not sure how to specify arbitrary patterns for the new dimensions, but that should take care of most use cases.

The use case that motivated this function in the first place is that I am doing some processing on 4D arrays and I need to reduce them but return a result with the original dimensionality (but not shape). atleast_nd seemed like a better solution than atleast_4d.
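A minimal sketch of what such a function and use case could look like. This is not the code from PR #7804: the signature, the keyword-only name `pos` (the name tentatively adopted elsewhere in this thread) and the values 'left'/'right' are assumptions for illustration.

```python
import numpy as np

def atleast_nd(a, ndim, *, pos="left"):
    """Hypothetical sketch: pad `a` with size-1 axes up to `ndim` dimensions."""
    a = np.asanyarray(a)
    missing = ndim - a.ndim
    if missing <= 0:
        return a  # nothing to add
    if pos == "left":
        shape = (1,) * missing + a.shape   # prepend, like np.atleast_2d
    elif pos == "right":
        shape = a.shape + (1,) * missing   # append
    else:
        raise ValueError("pos must be 'left' or 'right'")
    return a.reshape(shape)

data = np.ones((2, 3, 4, 5))
reduced = data.sum(axis=(0, 1))             # shape (4, 5)
restored = atleast_nd(reduced, data.ndim)   # shape (1, 1, 4, 5)
print(restored.shape)
print(atleast_nd(reduced, data.ndim, pos="right").shape)  # (4, 5, 1, 1)
```

The result has the original dimensionality but not the original shape, which is the situation described above.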

-Joe

On Wed, Jul 6, 2016 at 3:41 AM, <josef.pktd@gmail.com> wrote:

...

Marten van Kerkwijk

July 2016

2:22 p.m.

New subject: Added atleast_nd, request for clarification/cleanup of atleast_3d

Juan Nunez-Iglesias

2:56 p.m.

New subject: Added atleast_nd, request for clarification/cleanup of atleast_3d

We use np.atleast_2d extensively in scikit-image, and I also use it in a *lot* of my own code now that scikit-learn stopped accepting 1D arrays as feature vectors.

...

What is the advantage of `np.atleast_nd` over `np.array(a, copy=False, ndmin=n)`?
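For reference, the `ndmin` argument of np.array behaves as shown below. The `copy` argument is left out of this sketch because its handling is not the point here; the variable names are arbitrary.

```python
import numpy as np

a = np.arange(3)                 # shape (3,)
b = np.array(a, ndmin=3)         # size-1 axes are prepended as needed
print(b.shape)                   # (1, 1, 3)

c = np.array(np.ones((2, 3, 4, 5)), ndmin=3)
print(c.shape)                   # (2, 3, 4, 5): already has enough dimensions
```

`ndmin` only prepends axes, so it covers the 'left' case but gives no control over where the new axes go.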

Sebastian Berg

3:14 p.m.

New subject: Added atleast_nd, request for clarification/cleanup of atleast_3d

On Mi, 2016-07-06 at 10:22 -0400, Marten van Kerkwijk wrote:

...

Nathaniel Smith

4:35 p.m.

New subject: Added atleast_nd, request for clarification/cleanup of atleast_3d

On Jul 6, 2016 6:12 AM, "Joseph Fox-Rabinovitz" <jfoxrabinovitz@gmail.com> wrote:

...

Joseph Fox-Rabinovitz

4:48 p.m.

New subject: Added atleast_nd, request for clarification/cleanup of atleast_3d

...

Stephan Hoyer

5:43 p.m.

New subject: Added atleast_nd, request for clarification/cleanup of atleast_3d

On Tue, Jul 5, 2016 at 10:06 PM, Nathaniel Smith <njs@pobox.com> wrote:

...

Agreed. I would avoid adding atleast_nd. We could discourage using atleast_3d (certainly the behavior is indeed surprising), but I'm not sure it's worth the trouble.

Joseph Fox-Rabinovitz

July 2016

6:21 p.m.

New subject: Added atleast_nd, request for clarification/cleanup of atleast_3d

...

Benjamin Root

6:25 p.m.

New subject: Added atleast_nd, request for clarification/cleanup of atleast_3d

I wouldn't have the keyword be "where", as that collides with the notion of "where" elsewhere in numpy.

On Wed, Jul 6, 2016 at 2:21 PM, Joseph Fox-Rabinovitz <jfoxrabinovitz@gmail.com> wrote:

...

Joseph Fox-Rabinovitz

6:36 p.m.

New subject: Added atleast_nd, request for clarification/cleanup of atleast_3d

...

Eric Firing

6:57 p.m.

New subject: Added atleast_nd, request for clarification/cleanup of atleast_3d

On 2016/07/06 8:25 AM, Benjamin Root wrote:

...

I wouldn't have the keyword be "where", as that collides with the notion of "where" elsewhere in numpy.

Agreed. Maybe "side"? (I find atleast_1d and atleast_2d to be very helpful for handling inputs, as Ben noted; I'm skeptical as to the value of atleast_3d and atleast_nd.) Eric

Juan Nunez-Iglesias

7:01 p.m.

New subject: Added atleast_nd, request for clarification/cleanup of atleast_3d

...

I wouldn't have the keyword be "where", as that collides with the notion of "where" elsewhere in numpy.

Joseph Fox-Rabinovitz

7:15 p.m.

New subject: Added atleast_nd, request for clarification/cleanup of atleast_3d

On Wed, Jul 6, 2016 at 3:01 PM, Juan Nunez-Iglesias <jni.soma@gmail.com> wrote:

...

Joseph Fox-Rabinovitz

July 2016

7:20 p.m.

New subject: Added atleast_nd, request for clarification/cleanup of atleast_3d

On Wed, Jul 6, 2016 at 2:57 PM, Eric Firing <efiring@hawaii.edu> wrote:

...

Benjamin Root

7:30 p.m.

New subject: Added atleast_nd, request for clarification/cleanup of atleast_3d

...

Sebastian Berg

8:34 a.m.

New subject: Added atleast_nd, request for clarification/cleanup of atleast_3d

On Mi, 2016-07-06 at 15:30 -0400, Benjamin Root wrote:

...

I don't see how one could define a spec that would take an arbitrary array of indices at which to place new dimensions. By definition, you

...

Joseph Fox-Rabinovitz

1:11 p.m.

New subject: Added atleast_nd, request for clarification/cleanup of atleast_3d

On Thu, Jul 7, 2016 at 4:34 AM, Sebastian Berg <sebastian@sipsolutions.net> wrote:

...

On Mi, 2016-07-06 at 15:30 -0400, Benjamin Root wrote:

...
I don't see how one could define a spec that would take an arbitrary array of indices at which to place new dimensions. By definition, you

You just give a reordered range, so that (1, 0, 2) would be the current 3D version. If 1D, fill in `1` and `2`, if 2D, fill in only `2` (0D, add everything of course).

...

However, I have my doubts that it is actually easier to understand then to write yourself ;).

A dictionary or ragged list would be better for that: either {1: (1, 0), 2: (2,)} or [(1, 0), (2,)]. The first is more clear since the index in the list is the starting ndim - 1.

...

- Sebastian

...
don't know how many dimensions are going to be added. If you knew, then you wouldn't be calling this function. I can only imagine simple rules such as 'left' or 'right' or maybe something akin to what at_least3d() implements.

On Wed, Jul 6, 2016 at 3:20 PM, Joseph Fox-Rabinovitz <jfoxrabinovitz @gmail.com> wrote:

...
...
On 2016/07/06 8:25 AM, Benjamin Root wrote:

...
I wouldn't have the keyword be "where", as that collides with

On Wed, Jul 6, 2016 at 2:57 PM, Eric Firing <efiring@hawaii.edu> wrote: the notion

...
...
of "where" elsewhere in numpy.

Agreed. Maybe "side"?

I have tentatively changed it to "pos". The reason that I don't like "side" is that it implies only a subset of the possible ways that that the position of the new dimensions can be specified. The current implementation only puts things on one side or the other, but I have considered also allowing an array of indices at which to place new dimensions, and/or a dictionary keyed by the starting ndims. I do not think "side" would be appropriate for these extended cases, even if they are very unlikely to ever materialize.

-Joe

...
(I find atleast_1d and atleast_2d to be very helpful for handling inputs, as Ben noted; I'm skeptical as to the value of atleast_3d and atleast_nd.)

Eric

_______________________________________________ NumPy-Discussion mailing list NumPy-Discussion@scipy.org https://mail.scipy.org/mailman/listinfo/numpy-discussion

NumPy-Discussion mailing list NumPy-Discussion@scipy.org https://mail.scipy.org/mailman/listinfo/numpy-discussion

_______________________________________________ NumPy-Discussion mailing list NumPy-Discussion@scipy.org https://mail.scipy.org/mailman/listinfo/numpy-discussion

_______________________________________________ NumPy-Discussion mailing list NumPy-Discussion@scipy.org https://mail.scipy.org/mailman/listinfo/numpy-discussion

Joseph Fox-Rabinovitz

2:41 p.m.

New subject: Added atleast_nd, request for clarification/cleanup of atleast_3d

...

On Thu, Jul 7, 2016 at 4:34 AM, Sebastian Berg <sebastian@sipsolutions.net> wrote:

...
On Mi, 2016-07-06 at 15:30 -0400, Benjamin Root wrote:

...
I don't see how one could define a spec that would take an arbitrary array of indices at which to place new dimensions. By definition, you

You just give a reordered range, so that (1, 0, 2) would be the current 3D version. If 1D, fill in `1` and `2`, if 2D, fill in only `2` (0D, add everything of course).

I was originally thinking (-1, 0) for the 2D case. Just go along the list and fill as many dims as necessary. Your way is much better since it does not require a different operation for positive and negative indices.
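One possible reading of the "reordered range" idea, sketched under assumptions: the spec is a permutation of `range(n)`, existing axes keep their relative order, and the last `n - arr.ndim` entries give the output positions of the new size-1 axes. The name `atleast_nd_ordered` is hypothetical.

```python
import numpy as np


def atleast_nd_ordered(arr, order):
    """Hypothetical helper: `order` is a permutation of range(n).

    Existing axes keep their relative order; the last (n - arr.ndim)
    entries of `order` are read as the output positions of the new
    size-1 axes. This reading of the proposal is an assumption.
    """
    arr = np.asanyarray(arr)
    n = len(order)
    if sorted(order) != list(range(n)):
        raise ValueError("order must be a permutation of range(n)")
    if n - arr.ndim <= 0:
        return arr  # already has enough dimensions
    new_axes = set(order[arr.ndim:])
    old_dims = iter(arr.shape)
    # Walk the output axes: size 1 where a new axis goes, else the next old size.
    shape = tuple(1 if i in new_axes else next(old_dims) for i in range(n))
    return arr.reshape(shape)


print(atleast_nd_ordered(np.arange(3), (1, 0, 2)).shape)     # (1, 3, 1)
print(atleast_nd_ordered(np.ones((2, 3)), (1, 0, 2)).shape)  # (2, 3, 1)
```

Under this reading, `(1, 0, 2)` reproduces the shapes returned by `np.atleast_3d`, and `(1, 0)` reproduces `np.atleast_2d`.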

...
However, I have my doubts that it is actually easier to understand than to write yourself ;).

A dictionary or ragged list would be better for that: either {1: (1, 0), 2: (2,)} or [(1, 0), (2,)]. The first is clearer, since in the list form the index is the starting ndim - 1.
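A minimal sketch of how such a spec could be applied, assuming each tuple lists axis positions to insert one after another with `np.expand_dims`. The helper name `atleast_nd_from_spec` is hypothetical. The spec quoted above has no entry for 0-d input, so the sketch leaves such input unchanged.

```python
import numpy as np


def atleast_nd_from_spec(arr, spec):
    """Hypothetical helper: `spec` maps a starting ndim to a tuple of axis
    positions, inserted one at a time with np.expand_dims."""
    arr = np.asanyarray(arr)
    for axis in spec.get(arr.ndim, ()):  # no entry: return the input as-is
        arr = np.expand_dims(arr, axis)
    return arr


spec_3d = {1: (1, 0), 2: (2,)}  # dictionary form from the message
ragged_3d = [(1, 0), (2,)]      # list form: index = starting ndim - 1
as_dict = {i + 1: axes for i, axes in enumerate(ragged_3d)}

print(atleast_nd_from_spec(np.arange(3), spec_3d).shape)     # (1, 3, 1)
print(atleast_nd_from_spec(np.ones((2, 3)), spec_3d).shape)  # (2, 3, 1)
```

Converting the ragged list to a dictionary makes the "index is the starting ndim - 1" offset explicit, which is the readability point made above.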

...
- Sebastian

...
don't know how many dimensions are going to be added. If you knew, then you wouldn't be calling this function. I can only imagine simple rules such as 'left' or 'right' or maybe something akin to what atleast_3d() implements.

On Wed, Jul 6, 2016 at 3:20 PM, Joseph Fox-Rabinovitz <jfoxrabinovitz@gmail.com> wrote:

...
On Wed, Jul 6, 2016 at 2:57 PM, Eric Firing <efiring@hawaii.edu> wrote:

...
On 2016/07/06 8:25 AM, Benjamin Root wrote:

...
I wouldn't have the keyword be "where", as that collides with the notion of "where" elsewhere in numpy.

Agreed. Maybe "side"?

I have tentatively changed it to "pos". The reason that I don't like "side" is that it implies only a subset of the possible ways that the position of the new dimensions can be specified. The current implementation only puts things on one side or the other, but I have considered also allowing an array of indices at which to place new dimensions, and/or a dictionary keyed by the starting ndims. I do not think "side" would be appropriate for these extended cases, even if they are very unlikely to ever materialize.

-Joe

...
(I find atleast_1d and atleast_2d to be very helpful for handling inputs, as Ben noted; I'm skeptical as to the value of atleast_3d and atleast_nd.)

Eric

Joseph Fox-Rabinovitz

October 2016

6:29 p.m.

New subject: Added atleast_nd, request for clarification/cleanup of atleast_3d

...

I would like to follow up on my original PR (#7804). While there appears to be some debate as to whether the PR is numpy material to begin with, there do not appear to be any technical issues with it. To make the decision more straightforward, I factored out the non-controversial bug fixes to masked arrays into PR #7823, along with their regression tests. This way, the original enhancement can be closed or left hanging indefinitely (even though I hope neither happens). PR #7804 still has the bug fixes duplicated in it.

Regards,

-Joe

On Thu, Jul 7, 2016 at 9:11 AM, Joseph Fox-Rabinovitz <jfoxrabinovitz@gmail.com> wrote:

...

David

July 2016

10:20 p.m.

New subject: [Numpy-discussion] Added atleast_nd, request for clarification/cleanup of atleast_3d

Joseph Fox-Rabinovitz <jfoxrabinovitz <at> gmail.com> writes:

...

On Wed, Jul 6, 2016 at 2:57 PM, Eric Firing <efiring <at> hawaii.edu> wrote:

...
On 2016/07/06 8:25 AM, Benjamin Root wrote:

...
I wouldn't have the keyword be "where", as that collides with the notion of "where" elsewhere in numpy.

Agreed. Maybe "side"?

I have tentatively changed it to "pos". The reason that I don't like "side" is that it implies only a subset of the possible ways that the position of the new dimensions can be specified. The current implementation only puts things on one side or the other, but I have considered also allowing an array of indices at which to place new dimensions, and/or a dictionary keyed by the starting ndims. I do not think "side" would be appropriate for these extended cases, even if they are very unlikely to ever materialize.

-Joe

...
(I find atleast_1d and atleast_2d to be very helpful for handling inputs, as Ben noted; I'm skeptical as to the value of atleast_3d and atleast_nd.)

Eric

What about `order='C'` or `order='F'` for the argument name?

participants (11)

Benjamin Root
David
Eric Firing
josef.pktd＠gmail.com
Joseph Fox-Rabinovitz
Juan Nunez-Iglesias
Marten van Kerkwijk
Nathaniel Smith
Ralf Gommers
Sebastian Berg
Stephan Hoyer
