
Not exactly sure whether this should be considered a bug or not. It came up in a fairly general function of mine for processing satellite data. Unexpectedly, one of the satellite files had no scans in it, triggering an exception when I tried to reshape its data.

So, if I know all of the dimensions, I can reshape just fine. But if I want to use the nifty -1 semantics, it completely falls apart. I can see arguments going either way for whether this is a bug or not. Thoughts?

Ben Root
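A minimal sketch of the failure, with illustrative shapes (the actual satellite-processing code isn't shown):

    import numpy as np

    # A file with no scans yields a leading dimension of 0.
    a = np.zeros((0, 5, 64))

    a.reshape((0, 5 * 64))   # fine: every dimension is spelled out
    a.reshape((0, 5, -1))    # ValueError: resolving the -1 divides the
                             # total size by the product of the specified
                             # dims, which is 0 here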

On Tue, 2016-02-23 at 11:45 -0500, Benjamin Root wrote:
I think if there is a simple logic (like using 1 for all zeros in both the input and output shape for the -1 calculation), maybe we could do it. I would like someone to think carefully about whether it would also allow some unexpected generalizations. And at least I am getting a BrainOutOfResourcesError right now trying to figure that out :).

- Sebastian

I'd be more than happy to write up the patch. I don't think it would be quite like making zeros be ones, but it would be along those lines. One case I need to wrap my head around is making sure that a 0 would result if the following were done:
a = np.ones((0, 5 * 64))
a.shape = (-1, 5, 64)
EDIT: Just tried the above, and it works as expected (zero in the first dim)! Just tried out a couple of other combos:
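(A best-guess reconstruction based on the follow-up message below; the original session output is not in the archive:)

    a = np.ones((0, 5 * 64))
    a.shape = (-1, 5, 64)   # works: the first dimension comes out as 0
    a.shape = (0, 5, 64)    # works: fully explicit, 0 == 0 elements
    a.shape = (0, 5, -1)    # ValueError: the -1 cannot be resolved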
This is looking more and more like a bug to me.

Ben Root

On Tue, Feb 23, 2016 at 1:58 PM, Sebastian Berg <sebastian@sipsolutions.net> wrote:

On Tue, 2016-02-23 at 14:57 -0500, Benjamin Root wrote:
Seems right to me on first sight :). (I don't like shape assignments, though; who cares about one extra view?) Well, maybe use 1 instead of 0 (i.e. ignore 0s), but if the result for the -1 comes out as 1 and the old shape had a 0, convert the 1 back to 0. It is starting to sound a bit tricky, though I think it might be straightforward (i.e. no real traps, and when it works it is always what you expect). The main point is whether you can design cases where the conversion back to 0 hides bugs by not failing when it should, and whether that is a tradeoff we are willing to accept.

- Sebastian
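A rough sketch of one possible reading of that rule, in pure Python (this is the proposal under discussion, not NumPy's actual implementation, and the function name is made up for illustration):

    def resolve_new_shape(old_shape, new_shape):
        # Solve for a single -1 in new_shape, treating 0-length dimensions
        # as length 1 for the division and restoring the 0 afterwards
        # (the "convert the 1 back to 0" step described above).
        if new_shape.count(-1) != 1:
            raise ValueError("expected exactly one -1")

        def product(dims, ignore):
            p = 1
            for d in dims:
                if d not in ignore:
                    p *= d
            return p

        old_p = product(old_shape, ignore=(0,))
        new_p = product(new_shape, ignore=(0, -1))
        if old_p % new_p != 0:
            raise ValueError("incompatible shapes")
        inferred = old_p // new_p

        # If the old array was empty but the new explicit dims contain no
        # 0, the 0 has to land on the inferred dimension.
        if 0 in old_shape and 0 not in new_shape and inferred == 1:
            inferred = 0
        return tuple(inferred if d == -1 else d for d in new_shape)

    resolve_new_shape((0, 320), (-1, 5, 64))   # -> (0, 5, 64)
    resolve_new_shape((0, 320), (0, 5, -1))    # -> (0, 5, 64)
    resolve_new_shape((12, 0), (10, -1, 2))    # ValueError: 12 % 20 != 0

Whether a rule like this can hide bugs in some corner case is exactly the open question.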

On Tue, Feb 23, 2016 at 8:45 AM, Benjamin Root <ben.v.root@gmail.com> wrote:
Sure, it's totally ambiguous. These are all legal:

In [1]: a = np.zeros((0, 5, 64))
In [2]: a.shape = (0, 5 * 64)
In [3]: a.shape = (0, 5 * 65)
In [4]: a.shape = (0, 5, 102)
In [5]: a.shape = (0, 102, 64)

Generally, the -1 gets replaced by prod(old_shape) // prod(specified_entries_in_new_shape). If the specified new shape has a 0 in it, then this is a divide-by-zero. In this case it happens because the -1 is the solution to the equation prod((0, 5, 64)) == prod((0, 5, x)), for which there is no unique solution for 'x'. Your proposed solution feels very heuristic-y to me, and heuristics make me very nervous :-/

If what you really want to say is "flatten axes 1 and 2 together", then maybe there should be some API that lets you directly specify *that*? As a bonus you might be able to avoid awkward tuple manipulations to compute the new shape.

-n

--
Nathaniel J. Smith -- https://vorpus.org
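For what it's worth, a hedged sketch of what such a helper might look like (flatten_axes is a hypothetical name, not an existing NumPy function):

    import numpy as np

    def flatten_axes(a, start, stop):
        # Collapse axes start..stop (inclusive) into a single axis by
        # multiplying their lengths; every other axis is passed through
        # explicitly, so no -1 inference is needed.
        shape = a.shape
        merged = 1
        for n in shape[start:stop + 1]:
            merged *= n
        return a.reshape(shape[:start] + (merged,) + shape[stop + 1:])

    a = np.zeros((0, 5, 64))
    print(flatten_axes(a, 1, 2).shape)   # (0, 320), even with the 0 present

Because the new shape is fully explicit, the zero-length case poses no ambiguity.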

On Tue, Feb 23, 2016 at 3:14 PM, Nathaniel Smith <njs@pobox.com> wrote:
> Sure, it's totally ambiguous. These are all legal:
I would argue that except for the first reshape, all of those should be an error, and that the current algorithm is buggy. This isn't a heuristic. It isn't guessing. It is making the semantics consistent. The fact that I can do:

a.shape = (-1, 5, 64)

or

a.shape = (0, 5, 64)

but not

a.shape = (0, 5, -1)

is totally inconsistent.

Ben Root

On Tue, Feb 23, 2016 at 12:23 PM, Benjamin Root <ben.v.root@gmail.com> wrote:
Reshape doesn't care about axes at all; all it cares about is that the number of elements stays the same. E.g. this is also totally legal:

np.zeros((12, 5)).reshape((10, 3, 2))

And so are the equivalents:

np.zeros((12, 5)).reshape((-1, 3, 2))
np.zeros((12, 5)).reshape((10, -1, 2))
np.zeros((12, 5)).reshape((10, 3, -1))
It's certainly annoying and unpleasant, but it follows inevitably from the most natural way of defining the -1 semantics, so I'm not sure I'd say "inconsistent" :-)

What should this do?

np.zeros((12, 0)).reshape((10, -1, 2))

-n

--
Nathaniel J. Smith -- https://vorpus.org
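For concreteness, a sketch of what that example does under the plain division rule (assuming NumPy's behavior at the time; worth verifying):

    import numpy as np

    a = np.zeros((12, 0))        # 0 elements in total
    b = a.reshape((10, -1, 2))   # the -1 resolves to 0 // (10 * 2) == 0
    print(b.shape)               # (10, 0, 2): legal, since 0 == 0 elements
    # Under the strict-consistency reading argued above, this would
    # arguably be an error instead: 12 does not factor into 10 and 2.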
