[Numpy-discussion] Do we want scalar casting to behave as it does at the moment?

Mon Nov 12 23:15:48 EST 2012

On Monday, November 12, 2012, Olivier Delalleau wrote:

> 2012/11/12 Nathaniel Smith <njs at pobox.com>
>
>> On Mon, Nov 12, 2012 at 8:54 PM, Matthew Brett <matthew.brett at gmail.com>
>> wrote:
>> > Hi,
>> >
>> > I wanted to check that everyone knows about and is happy with the
>> > scalar casting changes from 1.6.0.
>> >
>> > Specifically, the rules for (array, scalar) casting have changed such
>> > that the resulting dtype depends on the _value_ of the scalar.
>> >
>> > Mark W has documented these changes here:
>> >
>> > http://docs.scipy.org/doc/numpy/reference/ufuncs.html#casting-rules
>> > http://docs.scipy.org/doc/numpy/reference/generated/numpy.result_type.html
>> > http://docs.scipy.org/doc/numpy/reference/generated/numpy.promote_types.html
>> >
>> > Specifically, as of 1.6.0:
>> >
>> > In [19]: arr = np.array([1.], dtype=np.float32)
>> >
>> > In [20]: (arr + (2**16-1)).dtype
>> > Out[20]: dtype('float32')
>> >
>> > In [21]: (arr + (2**16)).dtype
>> > Out[21]: dtype('float64')
>> >
>> > In [25]: arr = np.array([1.], dtype=np.int8)
>> >
>> > In [26]: (arr + 127).dtype
>> > Out[26]: dtype('int8')
>> >
>> > In [27]: (arr + 128).dtype
>> > Out[27]: dtype('int16')
>> >
>> > There's discussion about the changes here:
>> >
>> > http://mail.scipy.org/pipermail/numpy-discussion/2011-September/058563.html
>> > http://mail.scipy.org/pipermail/numpy-discussion/2011-March/055156.html
>> > http://mail.scipy.org/pipermail/numpy-discussion/2012-February/060381.html
>> >
>> > It seems to me that this change is hard to explain, and does what you
>> > want only some of the time, making it a false friend.
>>
>> The old behaviour was that in these cases, the scalar was always cast
>> to the type of the array, right? So
>>   np.array([1], dtype=np.int8) + 256
>> returned 1? Is that the behaviour you prefer?
>>
>> I agree that the 1.6 behaviour is surprising and somewhat
>> inconsistent. There are many places where you can get an overflow in
>> numpy, and in all the other cases we just let the overflow happen. And
>> in fact you can still get an overflow with arr + scalar operations, so
>> this doesn't really fix anything.
>>
>> I find the specific handling of unsigned -> signed and float32 ->
>> float64 upcasting confusing as well. (Sure, 2**16 isn't exactly
>> representable as a float32, but it doesn't *overflow*, it just gives
>> you 2.0**16... if I'm using float32 then I presumably don't care that
>> much about exact representability, so it's surprising that numpy is
>> working to enforce it, and definitely a separate decision from what to
>> do about overflow.)
>>
>> None of those threads seem to really get into the question of what the
>> best behaviour here *is*, though.
>>
>> Possibly the most defensible choice is to treat ufunc(arr, scalar)
>> operations as performing an implicit cast of the scalar to arr's
>> dtype, and using the standard implicit casting rules -- which I think
>> means, raising an error if !can_cast(scalar, arr.dtype,
>> casting="safe")
>
>
> I like this suggestion. It may break some existing code, but I think it'd
> be for the best. The current behavior can be very confusing.
>
> -=- Olivier
>

"break some existing code"

I really should set up an email filter for this phrase and have it send
back an email automatically: "Are you nuts?!"

We just resolved an issue where the "safe" casting rule unexpectedly broke
existing code with regard to in-place operations.  The solution was to
warn about the change in the upcoming release and to throw errors in a
later release.  Playing around with fundamental things like this needs to be
done methodically and carefully.
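
To make the "warn first, error later" idea concrete for the (array, scalar)
case discussed above, here is a minimal sketch. It is only an illustration:
`checked_add` is a hypothetical helper, not a NumPy function, it handles only
integer arrays with Python int scalars, and the choice of FutureWarning is an
assumption rather than anything decided in this thread.

```python
import warnings

import numpy as np


def checked_add(arr, scalar):
    """Add a Python int to an integer array (hypothetical helper).

    Warns, rather than raises, when the scalar does not fit in arr.dtype,
    so existing code keeps running during a transition period.
    """
    info = np.iinfo(arr.dtype)  # raises ValueError for non-integer dtypes
    if not (info.min <= scalar <= info.max):
        warnings.warn(
            f"{scalar!r} does not fit in {arr.dtype}; "
            "this may become an error in a later release",
            FutureWarning,
            stacklevel=2,
        )
        # Transition behaviour: let NumPy choose a wider result dtype.
        return arr + np.asarray(scalar)
    # The scalar fits, so cast it to the array's dtype and keep that dtype.
    return arr + np.asarray(scalar, dtype=arr.dtype)
```

With an int8 array, `checked_add(arr, 100)` stays int8, while
`checked_add(arr, 256)` emits the warning and returns a wider integer result.
Which wider dtype you get depends on the NumPy version.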

Cheers!
Ben Root