[Numpy-discussion] Do we want scalar casting to behave as it does at the moment?

Tue Nov 13 05:12:40 EST 2012

On Tue, Nov 13, 2012 at 6:13 AM, Benjamin Root <ben.root at ou.edu> wrote:
>
> On Monday, November 12, 2012, Matthew Brett wrote:
>>
>> Hi,
>>
>> On Mon, Nov 12, 2012 at 8:15 PM, Benjamin Root <ben.root at ou.edu> wrote:
>> >
>> >
>> > On Monday, November 12, 2012, Olivier Delalleau wrote:
>> >>
>> >> 2012/11/12 Nathaniel Smith <njs at pobox.com>
>> >>>
>> >>> On Mon, Nov 12, 2012 at 8:54 PM, Matthew Brett
>> >>> <matthew.brett at gmail.com>
>> >>> wrote:
>> >>> > Hi,
>> >>> >
>> >>> > I wanted to check that everyone knows about and is happy with the
>> >>> > scalar casting changes from 1.6.0.
>> >>> >
>> >>> > Specifically, the rules for (array, scalar) casting have changed
>> >>> > such
>> >>> > that the resulting dtype depends on the _value_ of the scalar.
>> >>> >
>> >>> > Mark W has documented these changes here:
>> >>> >
>> >>> > http://docs.scipy.org/doc/numpy/reference/ufuncs.html#casting-rules
>> >>> >
>> >>> >
>> >>> > http://docs.scipy.org/doc/numpy/reference/generated/numpy.result_type.html
>> >>> >
>> >>> >
>> >>> > http://docs.scipy.org/doc/numpy/reference/generated/numpy.promote_types.html
>> >>> >
>> >>> > Specifically, as of 1.6.0:
>> >>> >
>> >>> > In [19]: arr = np.array([1.], dtype=np.float32)
>> >>> >
>> >>> > In [20]: (arr + (2**16-1)).dtype
>> >>> > Out[20]: dtype('float32')
>> >>> >
>> >>> > In [21]: (arr + (2**16)).dtype
>> >>> > Out[21]: dtype('float64')
>> >>> >
>> >>> > In [25]: arr = np.array([1.], dtype=np.int8)
>> >>> >
>> >>> > In [26]: (arr + 127).dtype
>> >>> > Out[26]: dtype('int8')
>> >>> >
>> >>> > In [27]: (arr + 128).dtype
>> >>> > Out[27]: dtype('int16')
>> >>> >
>> >>> > There's discussion about the changes here:
>> >>> >
>> >>> >
>> >>> >
>> >>> > http://mail.scipy.org/pipermail/numpy-discussion/2011-September/058563.html
>> >>> >
>> >>> > http://mail.scipy.org/pipermail/numpy-discussion/2011-March/055156.html
>> >>> >
>> >>> >
>> >>> > http://mail.scipy.org/pipermail/numpy-discussion/2012-February/060381.html
>> >>> >
>> >>> > It seems to me that this change is hard to explain, and does what
>> >>> > you
>> >>> > want only some of the time, making it a false friend.
>> >>>
>> >>> The old behaviour was that in these cases, the scalar was always cast
>> >>> to the type of the array, right? So
>> >>>   np.array([1], dtype=np.int8) + 256
>> >>> returned 1? Is that the behaviour you prefer?
>> >>>
>> >>> I agree that the 1.6 behaviour is surprising and somewhat
>> >>> inconsistent. There are many places where you can get an overflow in
>> >>> numpy, and in all the other cases we just let the overflow happen. And
>> >>> in fact you can still get an overflow with arr + scalar operations, so
>> >>> this doesn't really fix anything.
>> >>>
>> >>> I find the specific handling of unsigned -> signed and float32 ->
>> >>> float64 upcasting confusing as well. (Sure, 2**16 isn't exactly
>> >>> representable as a float32, but it doesn't *overflow*, it just gives
>> >>> you 2.0**16... if I'm using float32 then I presumably don't care that
>> >>> much about exact representability, so it's surprising that numpy is
>> >>> working to enforce it, and definitely a separate decision from what to
>> >>> do about overflow.)
>> >>>
>> >>> None of those threads seem to really get into the question of what the
>> >>> best behaviour here *is*, though.
>> >>>
>> >>> Possibly the mo
>>
>> Well, hold on though, I was asking earlier in the thread what we
>> thought the behavior should be in 2.0 or maybe better put, sometime in
>> the future.
>>
>> If we know what we think the best answer is, and we think the best
>> answer is worth shooting for, then we can try to think of sensible
>> ways of getting there.
>>
>> I guess that's what Nathaniel and Olivier were thinking of but they
>> can correct me if I'm wrong...
>>
>> Cheers,
>>
>> Matthew
>
>
> I am fine with migrating to better solutions (I have yet to decide on this
> current situation, though), but whatever change is adopted must go through a
> deprecation process, which was my point.  Outright breaking of code as a
> first step is the wrong choice, and I was merely nipping it in the bud.

Thanks for your vigilance.

Unfortunately in this case AFAICT 1.6 already silently broke people's
code in this rather weird case, so that will presumably affect
whatever migration strategy we go with, but yes, the goal is to end up
in the right place and to get there in the right way...

-n
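
For anyone trying to keep the two rules straight, here is a rough pure-Python model of the int8 case discussed above. It is only a sketch: the helper names (fits, wrap, value_based_bits, add_old, add_value_based) are made up for illustration, it covers signed integer types only, and it is not how NumPy implements casting. It also assumes the pre-1.6 behaviour really was "cast the scalar to the array's type first", as guessed at earlier in the thread.

```python
# Simplified model of the two (array, scalar) casting rules in this thread.
# Hypothetical helpers for illustration only; this is not NumPy code.

SIGNED_BITS = (8, 16, 32, 64)


def fits(value, bits):
    """True if `value` is representable as a signed `bits`-bit integer."""
    return -(1 << (bits - 1)) <= value < (1 << (bits - 1))


def wrap(value, bits):
    """Two's-complement wraparound of `value` into a signed `bits`-bit integer."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >= (1 << (bits - 1)) else value


def value_based_bits(array_bits, scalar):
    """1.6-style rule: keep the array's width if the scalar fits, else widen."""
    for bits in SIGNED_BITS:
        if bits >= array_bits and fits(scalar, bits):
            return bits
    raise OverflowError("scalar does not fit in any modelled integer type")


def add_old(element, scalar, array_bits=8):
    """Assumed pre-1.6 rule: cast the scalar to the array type, then add."""
    return wrap(element + wrap(scalar, array_bits), array_bits)


def add_value_based(element, scalar, array_bits=8):
    """1.6-style rule: pick the width from the scalar's value, then add."""
    bits = value_based_bits(array_bits, scalar)
    return wrap(element + scalar, bits), bits
```

With this model:

```python
print(value_based_bits(8, 127))   # 8   (int8 stays int8)
print(value_based_bits(8, 128))   # 16  (widened to int16)
print(add_old(1, 256))            # 1   (256 wraps to 0 in int8)
print(add_value_based(1, 256))    # (257, 16)
print(add_value_based(127, 1))    # (-128, 8)  still overflows
```

The last line is the point made above: the scalar 1 fits in int8, so the value-based rule keeps int8 and the sum overflows anyway.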