Weird upcast behavior with 1.6.x, working as intended?
Hi, This is with numpy 1.6.1 under Linux x86_64, testing the upcast mechanism of "scalar + array":
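The code snippet itself did not survive in this archive; the following is a reconstruction consistent with the replies below (the dtype shown is the one reported under numpy 1.6.1):

```python
>>> import numpy
>>> x = numpy.ones(2, dtype=numpy.float32)       # 1-d float32 array
>>> s = numpy.array(3, dtype=numpy.complex128)   # 0-d complex128 array
>>> (x + s).dtype
dtype('complex64')
```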
Since it has to upcast my array (float32 is not "compatible enough" with complex128), why does it upcast it to complex64 instead of complex128? As far as I can tell 1.4.x and 1.5.x versions of numpy are indeed upcasting to complex128. Thanks, -=- Olivier
On Mon, Aug 8, 2011 at 10:54 AM, Olivier Delalleau <shish@keba.be> wrote:
The 0 dimensional array is being treated as a scalar, hence is cast to the type of the 1d array. This seems more consistent with the idea that 0 dimensional arrays act like scalars, but I suppose that is open to discussion. Chuck
2011/8/8 Charles R Harris <charlesr.harris@gmail.com>
I'm afraid I don't understand your reply. I know that the 0d array is a scalar, and thus should not lead to an upcast "unless the scalar is of a fundamentally different kind of data (*i.e.*, under a different hierarchy in the data-type hierarchy) than the array" (quoted from http://docs.scipy.org/doc/numpy/reference/ufuncs.html). This is one case where it is under a different hierarchy and thus should trigger an upcast. What I don't understand is why it upcasts to complex64 instead of complex128. Note that:

1. When replacing "numpy.ones" with "numpy.array" it yields complex128 (the expected upcast for scalar addition of complex128 with float32).
2. The behavior is similar if instead of "3" I use a number which cannot be represented exactly as a complex64, so it's not a rule about picking the smallest data type able to exactly represent the result. (Both variants are shown below.)

-=- Olivier
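A sketch of the two variants described above (a reconstruction, since the original snippets are not preserved in this archive; the dtypes are those reported for numpy 1.6.1):

```python
>>> import numpy
>>> # 1. 0-d "numpy.array" on both sides (scalar with scalar): normal promotion.
>>> (numpy.array(1, dtype=numpy.float32)
...  + numpy.array(3, dtype=numpy.complex128)).dtype
dtype('complex128')
>>> # 2. A value complex64 cannot hold exactly (2**25 + 1 needs 26 bits of
>>> # mantissa, float32 has 24) still reportedly comes out complex64.
>>> (numpy.ones(2, dtype=numpy.float32)
...  + numpy.array(2**25 + 1, dtype=numpy.complex128)).dtype
dtype('complex64')
```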
NB: I opened a ticket (http://projects.scipy.org/numpy/ticket/1949) about this, in case it would help getting some attention on this issue.

Besides this, I've been experimenting with the cast mechanisms of mixed scalar / array operations in numpy 1.6.1 on a Linux x86_64 architecture, and I can't make sense of the current behavior. Here are some experiments adding a two-element array to a scalar, both of integer types (a script reproducing them appears below):

(1) [0 0] (int8) + 0 (int32) -> [0 0] (int8)
(2) [0 0] (int8) + 127 (int32) -> [127 127] (int16)
(3) [0 0] (int8) + -128 (int32) -> [-128 -128] (int8)
(4) [0 0] (int8) + 2147483647 (int32) -> [2147483647 2147483647] (int32)
(5) [1 1] (int8) + 127 (int32) -> [128 128] (int16)
(6) [1 1] (int8) + 2147483647 (int32) -> [-2147483648 -2147483648] (int32)
(7) [127 127] (int8) + 1 (int32) -> [-128 -128] (int8)
(8) [127 127] (int8) + 127 (int32) -> [254 254] (int16)

Here are some examples of things that confuse me:

- The output dtype in (2) is int16 while in (3) it is int8, although both results can be written as int8.
- Adding a number that would cause an overflow upgrades the output dtype to one that can hold the result in (5), but not in (6).
- Adding a small int32 in (7) that causes an overflow keeps the base int8 dtype, but a bigger int32 in (8) (although still representable as an int8) makes it switch to int16. (If someone wonders, adding 126 instead of 127 in (8) would result in [-3 -3] (int8), so 127 is special for some reason.)

My feeling is actually that the logic is to try to downcast the scalar as much as possible without changing its value, but with a bug that 127 is not downcast to int8 and remains int16 (!).

Some more behavior that puzzles me, this time comparing + vs -:

(9) [0 0] (uint32) + -1 (int32) -> [-1 -1] (int64)
(10) [0 0] (uint32) - 1 (int32) -> [4294967295 4294967295] (uint32)

Here I would expect that adding -1 would be the same as subtracting 1, but that is not the case.

Is there anyone with intimate knowledge of the numpy casting behavior for mixed scalar / array operations who could explain the rules governing it? Thanks,

-=- Olivier
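For reference, a small script along these lines (a reconstruction, not Olivier's original code) reproduces experiments (1)-(8) on numpy 1.6.1:

```python
import numpy as np

# Each case pairs a two-element int8 array with an int32 scalar, as above.
cases = [
    (np.array([0, 0], np.int8), np.int32(0)),
    (np.array([0, 0], np.int8), np.int32(127)),
    (np.array([0, 0], np.int8), np.int32(-128)),
    (np.array([0, 0], np.int8), np.int32(2147483647)),
    (np.array([1, 1], np.int8), np.int32(127)),
    (np.array([1, 1], np.int8), np.int32(2147483647)),
    (np.array([127, 127], np.int8), np.int32(1)),
    (np.array([127, 127], np.int8), np.int32(127)),
]
for i, (a, s) in enumerate(cases, 1):
    r = a + s
    print('(%d) %s (%s) + %s (%s) -> %s (%s)'
          % (i, a, a.dtype, s, s.dtype, r, r.dtype))
```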
On Fri, Sep 23, 2011 at 1:52 PM, Olivier Delalleau <shish@keba.be> wrote:
> NB: I opened a ticket (http://projects.scipy.org/numpy/ticket/1949) about this, in case it would help getting some attention on this issue.
A lot of what you're seeing here is due to changes I did for 1.6. I generally made the casting mechanism symmetric (before it could give different types depending on the order of the input arguments), and added a little bit of value-based casting for scalars to reduce some of the overflow that could happen. Before, it always downcast to the smallest-size type regardless of the value in the scalar.
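For instance, np.result_type (added in 1.6) exposes the same value-based mechanism; a quick illustration (outputs follow from the rules described here rather than being quoted from the original thread):

```python
>>> import numpy as np
>>> # The scalar value 1 fits in int8, so there is no upcast:
>>> np.result_type(np.array([0, 0], np.int8), np.int32(1))
dtype('int8')
>>> # 1000 does not fit in int8, so the minimal containing type is chosen:
>>> np.result_type(np.array([0, 0], np.int8), np.int32(1000))
dtype('int16')
```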
Here would be the cause of it: https://github.com/numpy/numpy/blob/master/numpy/core/src/multiarray/convert... It should be a <= instead of a <, to include the value 127.
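In rough pseudocode, the selection logic looks something like the sketch below (illustrative Python, not the actual C in convert_datatype.c):

```python
import numpy as np

def min_signed_type(value):
    """Sketch: pick the smallest signed integer type that can hold `value`."""
    if -128 <= value < 127:            # bug: strict `<` should be `value <= 127`,
        return np.int8                 # so the boundary value 127 falls through
    elif -32768 <= value <= 32767:     # ...and lands here, in int16
        return np.int16
    elif -2**31 <= value <= 2**31 - 1:
        return np.int32
    else:
        return np.int64
```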
> - Adding a number that would cause an overflow causes the output dtype to be upgraded to a dtype that can hold the result in (5), but not in (6)
Actually, it's upgraded because of the previous point, not because of the overflow. With the change to <= above, (5) would produce int8.
In the second case (10), it's equivalent to np.subtract(np.array([0, 0], np.uint32), np.int32(1)). The scalar 1 fits into the uint32, so the result type of the subtraction is uint32. In the first case (9), the scalar -1 does not fit into a uint32, so the result type is upgraded to int64.
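A minimal illustration of this rule, with the dtypes as reported above:

```python
>>> import numpy as np
>>> a = np.array([0, 0], np.uint32)
>>> (a + np.int32(-1)).dtype   # -1 cannot fit in a uint32: promote to int64
dtype('int64')
>>> (a - np.int32(1)).dtype    # 1 fits in a uint32: stay uint32 (and wrap)
dtype('uint32')
```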
Hopefully my explanations help a bit. I think this situation is less than ideal, and it would be better to do something more automatic, like doing an up-conversion on overflow. This would more closely emulate Python's behavior of integers never overflowing, at least until 64 bits. This kind of change would be a fair bit of work, and would likely reduce the performance of NumPy slightly. Cheers, Mark
2011/9/30 Mark Wiebe <mwwiebe@gmail.com>
Thanks! It's reassuring to hear that part of it is caused by a bug, and that the other part has some logic behind it (even though it leads to surprising results). I appreciate you taking the time to clear it up for me :) -=- Olivier
participants (3)
- Charles R Harris
- Mark Wiebe
- Olivier Delalleau