[Numpy-discussion] Do we want scalar casting to behave as it does at the moment?

Olivier Delalleau shish at keba.be
Sun Jan 20 21:10:30 EST 2013


2013/1/18 Matthew Brett <matthew.brett at gmail.com>:
> Hi,
>
> On Fri, Jan 18, 2013 at 7:58 PM, Chris Barker - NOAA Federal
> <chris.barker at noaa.gov> wrote:
>> On Fri, Jan 18, 2013 at 4:39 AM, Olivier Delalleau <shish at keba.be> wrote:
>>> On Friday, January 18, 2013, Chris Barker - NOAA Federal wrote:
>>
>>> If you check again the examples in this thread exhibiting surprising /
>>> unexpected behavior, you'll notice most of them are with integers.
>>> The tricky thing about integers is that downcasting can dramatically change
>>> your result. With floats, not so much: you get approximation errors (usually
>>> what you want) and the occasional nan / inf creeping in (usually noticeable).
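To make the quoted distinction concrete, here is a small illustration (the variable names and values are my own, not from the thread):

```python
import numpy as np

# Downcasting an integer silently wraps around, changing the value entirely:
big = np.int16(300)
wrapped = big.astype(np.int8)  # 300 does not fit in int8; 300 mod 256 = 44
# The result bears no resemblance to the input -- a "dramatic" change.

# Downcasting a float merely loses precision -- the value stays close:
f = np.float64(0.1)
g = f.astype(np.float32)  # still approximately 0.1, just fewer digits
```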
>>
>> Fair enough.
>>
>> However my core argument is that people use non-standard (usually
>> smaller) dtypes for a reason, and it should be hard to accidentally
>> up-cast.
>>
>> This is in contrast with the argument that accidental down-casting can
>> produce incorrect results, and thus it should be hard to accidentally
>> down-cast -- same argument whether the incorrect results are drastic
>> or not....
>>
>> It's really a question of which of these we think should be prioritized.
>
> After thinking about it for a while, it seems to me Olivier's
> suggestion is a good one.
>
> The rule becomes the following:
>
> array + scalar casting is the same as array + array casting, except
> that array + scalar casting does not upcast the floating point
> precision of the array.
>
> Am I right (Chris, Perry?) that this deals with almost all your cases?
>  Meaning that it is upcasting of floats that is the main problem, not
> upcasting of (u)ints?
>
> This rule seems to me not very far from the current 1.6 behavior; it
> upcasts more - but the dtype is now predictable.  It's easy to
> explain.  It avoids the obvious errors that the 1.6 rules were trying
> to avoid.  It doesn't seem too far to stretch to make a distinction
> between rules about range (ints) and rules about precision (float,
> complex).
>
> What do y'all think?
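A sketch of how that rule plays out (my own example; the scalar case here happens to match what released NumPy already does, since the Python float 1.5 does not force a precision upcast on a float32 array):

```python
import numpy as np

a32 = np.zeros(3, dtype=np.float32)

# array + array: ordinary promotion applies, so float32 + float64 -> float64
arr_sum = a32 + np.zeros(3, dtype=np.float64)

# array + scalar: under the proposed rule, the array's floating point
# precision wins, so float32 + Python float stays float32
scalar_sum = a32 + 1.5
```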

Personally, I think the main issue with my suggestion is that it seems
hard to get there from the current behavior -- without potentially
breaking existing code in non-obvious ways. The main problematic case
I foresee is the typical "small_int_array + 1", which would now be
upcast when it wasn't before (in either 1.5 or 1.6). That's why I
think Nathaniel's proposal is more practical.
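To make the compatibility concern concrete (a sketch; the dtype outcome below is what released NumPy gives, since the scalar 1 fits comfortably in int8):

```python
import numpy as np

small = np.array([1, 2, 3], dtype=np.int8)

# In released NumPy (1.5 and 1.6 alike) this stays int8, because the
# scalar 1 fits in the array's dtype:
result = small + 1

# Under the rule discussed above, scalars would instead follow the
# array + array integer rules, so the result would jump to the default
# int dtype -- silently multiplying memory use for code like this.
```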

-=- Olivier


