
Not exactly sure whether this should be considered a bug or not. It came up in a fairly general function of mine for processing satellite data. Unexpectedly, one of the satellite files had no scans in it, triggering an exception when I tried to reshape its data.

So, if I know all of the dimensions, I can reshape just fine. But if I want to use the nifty -1 semantics, it completely falls apart. I can see arguments going either way for whether this is a bug or not. Thoughts?

Ben Root
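A minimal sketch of the failure, with illustrative shapes (the actual satellite-processing code isn't shown):

    import numpy as np

    # A file with no scans yields a leading dimension of 0.
    a = np.zeros((0, 5, 64))

    a.reshape((0, 5 * 64))   # fine: every dimension is spelled out
    a.reshape((0, 5, -1))    # ValueError: resolving the -1 divides the
                             # total size by the product of the specified
                             # dims, which is 0 here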

On Tue, 2016-02-23 at 11:45 -0500, Benjamin Root wrote:
I think if there is a simple logic (like using 1 for all zeros in both the input and output shape for the -1 calculation), maybe we could do it. I would like someone to think carefully about whether it would also allow some unexpected generalizations. And at least I am getting a BrainOutOfResourcesError right now trying to figure that out :).

- Sebastian

I'd be more than happy to write up the patch. I don't think it would be quite like making zeros be ones, but it would be along those lines. One case I need to wrap my head around is making sure that a 0 would result if the following were done:
a = np.ones((0, 5 * 64))
a.shape = (-1, 5, 64)
EDIT: Just tried the above, and it works as expected (zero in the first dim)! Just tried out a couple of other combos:
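(A best-guess reconstruction based on the follow-up message below; the original session output is not in the archive:)

    a = np.ones((0, 5 * 64))
    a.shape = (-1, 5, 64)   # works: the first dimension comes out as 0
    a.shape = (0, 5, 64)    # works: fully explicit, 0 == 0 elements
    a.shape = (0, 5, -1)    # ValueError: the -1 cannot be resolved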
This is looking more and more like a bug to me.

Ben Root

On Tue, Feb 23, 2016 at 1:58 PM, Sebastian Berg <sebastian@sipsolutions.net> wrote:

On Tue, 2016-02-23 at 14:57 -0500, Benjamin Root wrote:
Seems right to me on first sight :). (I don't like shape assignments, though; who cares about one extra view?) Well, maybe use 1 instead of 0 (i.e. ignore 0s), but if the result for the -1 comes out as 1 and the old shape had a 0, convert the 1 back to 0. It is starting to sound a bit tricky, though I think it might be straightforward (i.e. no real traps, and when it works it is always what you expect). The main point is whether you can design cases where the conversion back to 0 hides bugs by not failing when it should, and whether that is a tradeoff we are willing to accept.

- Sebastian
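A rough sketch of one possible reading of that rule, in pure Python (this is the proposal under discussion, not NumPy's actual implementation, and the function name is made up for illustration):

    def resolve_new_shape(old_shape, new_shape):
        # Solve for a single -1 in new_shape, treating 0-length dimensions
        # as length 1 for the division and restoring the 0 afterwards
        # (the "convert the 1 back to 0" step described above).
        if new_shape.count(-1) != 1:
            raise ValueError("expected exactly one -1")

        def product(dims, ignore):
            p = 1
            for d in dims:
                if d not in ignore:
                    p *= d
            return p

        old_p = product(old_shape, ignore=(0,))
        new_p = product(new_shape, ignore=(0, -1))
        if old_p % new_p != 0:
            raise ValueError("incompatible shapes")
        inferred = old_p // new_p

        # If the old array was empty but the new explicit dims contain no
        # 0, the 0 has to land on the inferred dimension.
        if 0 in old_shape and 0 not in new_shape and inferred == 1:
            inferred = 0
        return tuple(inferred if d == -1 else d for d in new_shape)

    resolve_new_shape((0, 320), (-1, 5, 64))   # -> (0, 5, 64)
    resolve_new_shape((0, 320), (0, 5, -1))    # -> (0, 5, 64)
    resolve_new_shape((12, 0), (10, -1, 2))    # ValueError: 12 % 20 != 0

Whether a rule like this can hide bugs in some corner case is exactly the open question.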

On Tue, Feb 23, 2016 at 8:45 AM, Benjamin Root <ben.v.root@gmail.com> wrote:
Sure, it's totally ambiguous. These are all legal:

In [1]: a = np.zeros((0, 5, 64))
In [2]: a.shape = (0, 5 * 64)
In [3]: a.shape = (0, 5 * 65)
In [4]: a.shape = (0, 5, 102)
In [5]: a.shape = (0, 102, 64)

Generally, the -1 gets replaced by prod(old_shape) // prod(specified_entries_in_new_shape). If the specified new shape has a 0 in it, then this is a divide-by-zero. In this case it happens because the -1 is the solution to the equation prod((0, 5, 64)) == prod((0, 5, x)), for which there is no unique solution for 'x'. Your proposed solution feels very heuristic-y to me, and heuristics make me very nervous :-/

If what you really want to say is "flatten axes 1 and 2 together", then maybe there should be some API that lets you directly specify *that*? As a bonus you might be able to avoid awkward tuple manipulations to compute the new shape.

-n

--
Nathaniel J. Smith -- https://vorpus.org
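For what it's worth, a hedged sketch of what such a helper might look like (flatten_axes is a hypothetical name, not an existing NumPy function):

    import numpy as np

    def flatten_axes(a, start, stop):
        # Collapse axes start..stop (inclusive) into a single axis by
        # multiplying their lengths; every other axis is passed through
        # explicitly, so no -1 inference is needed.
        shape = a.shape
        merged = 1
        for n in shape[start:stop + 1]:
            merged *= n
        return a.reshape(shape[:start] + (merged,) + shape[stop + 1:])

    a = np.zeros((0, 5, 64))
    print(flatten_axes(a, 1, 2).shape)   # (0, 320), even with the 0 present

Because the new shape is fully explicit, the zero-length case poses no ambiguity.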

On Tue, Feb 23, 2016 at 3:14 PM, Nathaniel Smith <njs@pobox.com> wrote:
> Sure, it's totally ambiguous. These are all legal:
I would argue that except for the first reshape, all of those should be an error, and that the current algorithm is buggy. This isn't a heuristic. It isn't guessing. It is making the semantics consistent. The fact that I can do:

a.shape = (-1, 5, 64)

or

a.shape = (0, 5, 64)

but not

a.shape = (0, 5, -1)

is totally inconsistent.

Ben Root

On Tue, Feb 23, 2016 at 12:23 PM, Benjamin Root <ben.v.root@gmail.com> wrote:
Reshape doesn't care about axes at all; all it cares about is that the number of elements stays the same. E.g. this is also totally legal:

np.zeros((12, 5)).reshape((10, 3, 2))

And so are the equivalents:

np.zeros((12, 5)).reshape((-1, 3, 2))
np.zeros((12, 5)).reshape((10, -1, 2))
np.zeros((12, 5)).reshape((10, 3, -1))
It's certainly annoying and unpleasant, but it follows inevitably from the most natural way of defining the -1 semantics, so I'm not sure I'd say "inconsistent" :-)

What should this do?

np.zeros((12, 0)).reshape((10, -1, 2))

-n

--
Nathaniel J. Smith -- https://vorpus.org
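For concreteness, a sketch of what that example does under the plain division rule (assuming NumPy's behavior at the time; worth verifying):

    import numpy as np

    a = np.zeros((12, 0))        # 0 elements in total
    b = a.reshape((10, -1, 2))   # the -1 resolves to 0 // (10 * 2) == 0
    print(b.shape)               # (10, 0, 2): legal, since 0 == 0 elements
    # Under the strict-consistency reading argued above, this would
    # arguably be an error instead: 12 does not factor into 10 and 2.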
