Weird upcast behavior with 1.6.x, working as intended?
Hi,

This is with numpy 1.6.1 under Linux x86_64, testing the upcast mechanism of "scalar + array":
import numpy; print (numpy.array(3, dtype=numpy.complex128) + numpy.ones(3, dtype=numpy.float32)).dtype
complex64
Since it has to upcast my array (float32 is not "compatible enough" with complex128), why does it upcast it to complex64 instead of complex128? As far as I can tell, 1.4.x and 1.5.x versions of numpy are indeed upcasting to complex128.

Thanks,

= Olivier
On Mon, Aug 8, 2011 at 10:54 AM, Olivier Delalleau <shish@keba.be> wrote:
Hi,
This is with numpy 1.6.1 under Linux x86_64, testing the upcast mechanism of "scalar + array":
import numpy; print (numpy.array(3, dtype=numpy.complex128) + numpy.ones(3, dtype=numpy.float32)).dtype
complex64
Since it has to upcast my array (float32 is not "compatible enough" with complex128), why does it upcast it to complex64 instead of complex128? As far as I can tell 1.4.x and 1.5.x versions of numpy are indeed upcasting to complex128.
The 0 dimensional array is being treated as a scalar, hence is cast to the type of the 1d array. This seems more consistent with the idea that 0 dimensional arrays act like scalars, but I suppose that is open to discussion.

Chuck
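For illustration, a minimal sketch of this scalar-vs-array distinction (outputs are what NumPy 1.6.x gives; recent NumPy releases behave differently):

    import numpy as np

    a = np.ones(3, dtype=np.int8)

    # A 0-d array participates in casting as a scalar: since int64 and
    # int8 are the same kind, the array dtype wins under the 1.6 rules.
    print((np.array(1, dtype=np.int64) + a).dtype)    # int8

    # A 1-d array of the same dtype follows ordinary array promotion.
    print((np.array([1], dtype=np.int64) + a).dtype)  # int64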
2011/8/8 Charles R Harris <charlesr.harris@gmail.com>
On Mon, Aug 8, 2011 at 10:54 AM, Olivier Delalleau <shish@keba.be> wrote:
Hi,
This is with numpy 1.6.1 under Linux x86_64, testing the upcast mechanism of "scalar + array":
import numpy; print (numpy.array(3, dtype=numpy.complex128) + numpy.ones(3, dtype=numpy.float32)).dtype
complex64
Since it has to upcast my array (float32 is not "compatible enough" with complex128), why does it upcast it to complex64 instead of complex128? As far as I can tell 1.4.x and 1.5.x versions of numpy are indeed upcasting to complex128.
The 0 dimensional array is being treated as a scalar, hence is cast to the type of the 1d array. This seems more consistent with the idea that 0 dimensional arrays act like scalars, but I suppose that is open to discussion.
Chuck
I'm afraid I don't understand your reply. I know that the 0d array is a scalar, and thus should not lead to an upcast "unless the scalar is of a fundamentally different kind of data (*i.e.*, under a different hierarchy in the datatype hierarchy) than the array" (quoted from http://docs.scipy.org/doc/numpy/reference/ufuncs.html). This is one case where it is under a different hierarchy and thus should trigger an upcast. What I don't understand is why it upcasts to complex64 instead of complex128. Note that:
1. When replacing "numpy.ones" with "numpy.array" it yields complex128 (the expected upcast for scalar addition of complex128 with float32).
2. The behavior is similar if instead of "3" I use a number which cannot be represented exactly as a complex64 (so it's not a rule about picking the smallest data type able to exactly represent the result).

= Olivier
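A snippet showing both observations (outputs are NumPy 1.6.x's; the value 2**25 + 1 is just one example of a number a float32 mantissa cannot hold exactly):

    import numpy as np

    a = np.ones(3, dtype=np.float32)

    # Observation 1: a 1-element (1-d) array upcasts as an array...
    print((np.array([3], dtype=np.complex128) + a).dtype)  # complex128
    # ...while the 0-d array is treated as a scalar and yields complex64.
    print((np.array(3, dtype=np.complex128) + a).dtype)    # complex64

    # Observation 2: a value float32 cannot represent exactly still
    # comes out as complex64 under the 1.6 rules, which only check range.
    v = np.array(2**25 + 1, dtype=np.complex128)
    print((v + a).dtype)                                   # complex64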
NB: I opened a ticket (http://projects.scipy.org/numpy/ticket/1949) about this, in case it would help getting some attention on this issue.

Besides this, I've been experimenting with the cast mechanisms of mixed scalar / array operations in numpy 1.6.1 on a Linux x86_64 architecture, and I can't make sense out of the current behavior. Here are some experiments adding a two-element array to a scalar (both of integer types):

(1) [0 0] (int8) + 0 (int32) -> [0 0] (int8)
(2) [0 0] (int8) + 127 (int32) -> [127 127] (int16)
(3) [0 0] (int8) + -128 (int32) -> [-128 -128] (int8)
(4) [0 0] (int8) + 2147483647 (int32) -> [2147483647 2147483647] (int32)
(5) [1 1] (int8) + 127 (int32) -> [128 128] (int16)
(6) [1 1] (int8) + 2147483647 (int32) -> [-2147483648 -2147483648] (int32)
(7) [127 127] (int8) + 1 (int32) -> [-128 -128] (int8)
(8) [127 127] (int8) + 127 (int32) -> [254 254] (int16)

Here are some examples of things that confuse me:
- Output dtype in (2) is int16 while in (3) it is int8, although both results can be written as int8
- Adding a number that would cause an overflow causes the output dtype to be upgraded to a dtype that can hold the result in (5), but not in (6)
- Adding a small int32 in (7) that causes an overflow makes it keep the base int8 dtype, but a bigger int32 (although still representable as an int8) in (8) makes it switch to int16 (if someone wonders, adding 126 instead of 127 in (8) would result in [-3 -3] (int8), so 127 is special for some reason)

My feeling is actually that the logic is to try to downcast the scalar as much as possible without changing its value, but with a bug that 127 is not downcast to int8, and remains int16 (!).

Some more behavior that puzzles me, this time comparing + vs -:

(9) [0 0] (uint32) + -1 (int32) -> [-1 -1] (int64)
(10) [0 0] (uint32) - 1 (int32) -> [4294967295 4294967295] (uint32)

Here I would expect that adding -1 would be the same as subtracting 1, but that is not the case.

Is there anyone with intimate knowledge of the numpy casting behavior for mixed scalar / array operations who could explain what are the rules governing it?

Thanks,

= Olivier
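A short script to reproduce the experiments above (the printed dtypes are NumPy 1.6.1's; newer releases changed these casting rules):

    import numpy as np

    # Each pair mirrors one experiment above: (int8 array, int32 scalar).
    cases = [
        (np.array([0, 0], np.int8), np.int32(0)),           # (1)
        (np.array([0, 0], np.int8), np.int32(127)),         # (2)
        (np.array([0, 0], np.int8), np.int32(-128)),        # (3)
        (np.array([0, 0], np.int8), np.int32(2147483647)),  # (4)
        (np.array([1, 1], np.int8), np.int32(127)),         # (5)
        (np.array([1, 1], np.int8), np.int32(2147483647)),  # (6)
        (np.array([127, 127], np.int8), np.int32(1)),       # (7)
        (np.array([127, 127], np.int8), np.int32(127)),     # (8)
    ]
    for i, (arr, scalar) in enumerate(cases, 1):
        result = arr + scalar
        print("(%d) %s (%s) + %s (int32) -> %s (%s)"
              % (i, arr, arr.dtype, scalar, result, result.dtype))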
On Fri, Sep 23, 2011 at 1:52 PM, Olivier Delalleau <shish@keba.be> wrote:
NB: I opened a ticket (http://projects.scipy.org/numpy/ticket/1949) about this, in case it would help getting some attention on this issue.
A lot of what you're seeing here is due to changes I did for 1.6. I generally made the casting mechanism symmetric (before it could give different types depending on the order of the input arguments), and added a little bit of value-based casting for scalars to reduce some of the overflow that could happen. Before, it always downcast to the smallest-size type regardless of the value in the scalar.
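Both behaviors can be probed with np.result_type, which was also added in 1.6 (outputs shown are the 1.6-era value-based ones; recent NumPy releases promote differently):

    import numpy as np

    # Symmetric promotion: argument order no longer matters.
    print(np.result_type(np.int8, np.float32))  # float32
    print(np.result_type(np.float32, np.int8))  # float32

    # Value-based casting for scalars: the value, not just the type,
    # decides how far the int32 scalar is downcast.
    a = np.zeros(2, dtype=np.int8)
    print(np.result_type(a, 1))     # int8:  1 fits in an int8
    print(np.result_type(a, 1000))  # int16: 1000 needs at least int16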
Besides this, I've been experimenting with the cast mechanisms of mixed scalar / array operations in numpy 1.6.1 on a Linux x86_64 architecture, and I can't make sense out of the current behavior. Here are some experiments adding a two-element array to a scalar (both of integer types):
(1) [0 0] (int8) + 0 (int32) -> [0 0] (int8)
(2) [0 0] (int8) + 127 (int32) -> [127 127] (int16)
(3) [0 0] (int8) + -128 (int32) -> [-128 -128] (int8)
(4) [0 0] (int8) + 2147483647 (int32) -> [2147483647 2147483647] (int32)
(5) [1 1] (int8) + 127 (int32) -> [128 128] (int16)
(6) [1 1] (int8) + 2147483647 (int32) -> [-2147483648 -2147483648] (int32)
(7) [127 127] (int8) + 1 (int32) -> [-128 -128] (int8)
(8) [127 127] (int8) + 127 (int32) -> [254 254] (int16)
Here are some examples of things that confuse me:
- Output dtype in (2) is int16 while in (3) it is int8, although both results can be written as int8
Here would be the cause of it:
https://github.com/numpy/numpy/blob/master/numpy/core/src/multiarray/convert...
It should be a <= instead of a <, to include the value 127.
- Adding a number that would cause an overflow causes the output dtype to be upgraded to a dtype that can hold the result in (5), but not in (6)
Actually, it's upgraded because of the previous point, not because of the overflow. With the change to <= above, this would produce int8.
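In Python terms, the selection logic Mark points at behaves roughly like this hypothetical sketch (pick_scalar_dtype is an illustrative name, not the actual C routine):

    import numpy as np

    def pick_scalar_dtype(value, fixed=False):
        """Pick the smallest signed dtype that can hold `value` -- a toy
        rendering of the scalar downcast logic, not the real C code."""
        for dt in (np.int8, np.int16, np.int32, np.int64):
            info = np.iinfo(dt)
            upper_ok = value <= info.max if fixed else value < info.max
            if info.min <= value and upper_ok:
                return np.dtype(dt)
        raise OverflowError("value too large for any signed type")

    print(pick_scalar_dtype(127))              # int16: `<` wrongly excludes 127
    print(pick_scalar_dtype(127, fixed=True))  # int8:  with `<=` it fits
    print(pick_scalar_dtype(-128))             # int8 either way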
- Adding a small int32 in (7) that causes an overflow makes it keep the base int8 dtype, but a bigger int32 (although still representable as an int8) in (8) makes it switch to int16 (if someone wonders, adding 126 instead of 127 in (8) would result in [-3 -3] (int8), so 127 is special for some reason)
My feeling is actually that the logic is to try to downcast the scalar as much as possible without changing its value, but with a bug that 127 is not downcast to int8, and remains int16 (!).
Some more behavior that puzzles me, this time comparing + vs -:
(9) [0 0] (uint32) + -1 (int32) -> [-1 -1] (int64)
(10) [0 0] (uint32) - 1 (int32) -> [4294967295 4294967295] (uint32)
Here I would expect that adding -1 would be the same as subtracting 1, but that is not the case.
In the second case, it's equivalent to np.subtract(np.array([0, 0], np.uint32), np.int32(1)). The scalar 1 fits into the uint32, so the result type of the subtraction is uint32. In the first case, the scalar -1 does not fit into the uint32, so it is upgraded to int64.
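A minimal check of the two cases (outputs follow the 1.6 value-based rules Mark describes; recent NumPy releases promote uint32 with an int32 scalar differently):

    import numpy as np

    z = np.zeros(2, dtype=np.uint32)

    # The scalar 1 fits in a uint32, so subtraction stays uint32 and wraps.
    r1 = z - np.int32(1)
    print(r1, r1.dtype)   # [4294967295 4294967295] uint32

    # The scalar -1 cannot fit in a uint32, so the result is upgraded to
    # int64, the smallest type that can hold both uint32 values and -1.
    r2 = z + np.int32(-1)
    print(r2, r2.dtype)   # [-1 -1] int64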
Is there anyone with intimate knowledge of the numpy casting behavior for mixed scalar / array operations who could explain what are the rules governing it?
Hopefully my explanations help a bit. I think this situation is less than ideal, and it would be better to do something more automatic, like doing an upconversion on overflow. This would more closely emulate Python's behavior of integers never overflowing, at least until 64 bits. This kind of change would be a fair bit of work, and would likely reduce the performance of NumPy slightly.

Cheers,
Mark
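To make the contrast with Python's integers concrete (the wraparound shown is standard fixed-width int8 arithmetic):

    import numpy as np

    # Python integers never overflow; they grow as needed.
    print(127 + 1)                          # 128

    # NumPy's fixed-width types wrap around instead.
    a = np.array([127, 127], dtype=np.int8)
    print(a + np.int8(1))                   # [-128 -128]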
2011/9/30 Mark Wiebe <mwwiebe@gmail.com>
On Fri, Sep 23, 2011 at 1:52 PM, Olivier Delalleau <shish@keba.be> wrote:
NB: I opened a ticket (http://projects.scipy.org/numpy/ticket/1949) about this, in case it would help getting some attention on this issue.
A lot of what you're seeing here is due to changes I did for 1.6. I generally made the casting mechanism symmetric (before it could give different types depending on the order of the input arguments), and added a little bit of value-based casting for scalars to reduce some of the overflow that could happen. Before, it always downcast to the smallest-size type regardless of the value in the scalar.
Besides this, I've been experimenting with the cast mechanisms of mixed scalar / array operations in numpy 1.6.1 on a Linux x86_64 architecture, and I can't make sense out of the current behavior. Here are some experiments adding a two-element array to a scalar (both of integer types):
(1) [0 0] (int8) + 0 (int32) -> [0 0] (int8)
(2) [0 0] (int8) + 127 (int32) -> [127 127] (int16)
(3) [0 0] (int8) + -128 (int32) -> [-128 -128] (int8)
(4) [0 0] (int8) + 2147483647 (int32) -> [2147483647 2147483647] (int32)
(5) [1 1] (int8) + 127 (int32) -> [128 128] (int16)
(6) [1 1] (int8) + 2147483647 (int32) -> [-2147483648 -2147483648] (int32)
(7) [127 127] (int8) + 1 (int32) -> [-128 -128] (int8)
(8) [127 127] (int8) + 127 (int32) -> [254 254] (int16)
Here are some examples of things that confuse me:
- Output dtype in (2) is int16 while in (3) it is int8, although both results can be written as int8
Here would be the cause of it:
https://github.com/numpy/numpy/blob/master/numpy/core/src/multiarray/convert...
It should be a <= instead of a <, to include the value 127.
- Adding a number that would cause an overflow causes the output dtype to be upgraded to a dtype that can hold the result in (5), but not in (6)
Actually, it's upgraded because of the previous point, not because of the overflow. With the change to <= above, this would produce int8.
- Adding a small int32 in (7) that causes an overflow makes it keep the base int8 dtype, but a bigger int32 (although still representable as an int8) in (8) makes it switch to int16 (if someone wonders, adding 126 instead of 127 in (8) would result in [-3 -3] (int8), so 127 is special for some reason)
My feeling is actually that the logic is to try to downcast the scalar as much as possible without changing its value, but with a bug that 127 is not downcast to int8, and remains int16 (!).
Some more behavior that puzzles me, this time comparing + vs -:
(9) [0 0] (uint32) + -1 (int32) -> [-1 -1] (int64)
(10) [0 0] (uint32) - 1 (int32) -> [4294967295 4294967295] (uint32)
Here I would expect that adding -1 would be the same as subtracting 1, but that is not the case.
In the second case, it's equivalent to np.subtract(np.array([0, 0], np.uint32), np.int32(1)). The scalar 1 fits into the uint32, so the result type of the subtraction is uint32. In the first case, the scalar -1 does not fit into the uint32, so it is upgraded to int64.
Is there anyone with intimate knowledge of the numpy casting behavior for mixed scalar / array operations who could explain what are the rules governing it?
Hopefully my explanations help a bit. I think this situation is less than ideal, and it would be better to do something more automatic, like doing an upconversion on overflow. This would more closely emulate Python's behavior of integers never overflowing, at least until 64 bits. This kind of change would be a fair bit of work, and would likely reduce the performance of NumPy slightly.
Cheers, Mark
Thanks! It's reassuring to hear that part of it is caused by a bug, and the other part has some logic behind it (even though it leads to surprising results). I appreciate you taking the time to clear it up for me :)

= Olivier