
Hi,

On Sun, Jan 20, 2013 at 6:10 PM, Olivier Delalleau <shish@keba.be> wrote:
2013/1/18 Matthew Brett <matthew.brett@gmail.com>:
Hi,
On Fri, Jan 18, 2013 at 7:58 PM, Chris Barker - NOAA Federal <chris.barker@noaa.gov> wrote:
On Fri, Jan 18, 2013 at 4:39 AM, Olivier Delalleau <shish@keba.be> wrote:
On Friday, January 18, 2013, Chris Barker - NOAA Federal wrote:
If you look again at the examples in this thread exhibiting surprising / unexpected behavior, you'll notice most of them involve integers. The tricky thing about integers is that downcasting can dramatically change your result. With floats, not so much: you get approximation errors (usually what you want) and the occasional nan / inf creeping in (usually noticeable).
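To make the asymmetry concrete, here is a small illustration (my own, not from the thread) of the two failure modes being contrasted: a small integer dtype silently wraps around on overflow, while a small float dtype merely rounds:

```python
import numpy as np

# Integers: staying in a small dtype can silently wrap around.
a = np.array([100], dtype=np.int8)
b = np.array([100], dtype=np.int8)
wrapped = a + b
print(wrapped, wrapped.dtype)  # 100 + 100 = 200 does not fit in int8: wraps to -56

# Floats: staying in a small dtype only loses precision.
# 1.0 is below float32 resolution at 1e8, so the sum rounds back to 1e8.
print(np.float32(1e8) + np.float32(1.0) == np.float32(1e8))  # True
```

The wrapped integer result is plausible-looking garbage, whereas the float result is merely the usual rounding error.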
fair enough.
However my core argument is that people use non-standard (usually smaller) dtypes for a reason, and it should be hard to accidentally up-cast.
This is in contrast with the argument that accidental down-casting can produce incorrect results, and thus it should be hard to accidentally down-cast -- same argument whether the incorrect results are drastic or not....
It's really a question of which of these we think should be prioritized.
After thinking about it for a while, it seems to me Olivier's suggestion is a good one.
The rule becomes the following:
array + scalar casting is the same as array + array casting except array + scalar casting does not upcast floating point precision of the array.
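For concreteness, a sketch of what that rule means in practice. The particular cases below already behave this way in NumPy's scalar casting (behavior has shifted across versions, so treat this as illustrative rather than normative):

```python
import numpy as np

arr32 = np.ones(3, dtype=np.float32)

# array + array: upcasts following the usual promotion rules
print((arr32 + np.ones(3, dtype=np.float64)).dtype)  # float64

# array + scalar: the float array's precision is preserved
print((arr32 + 1.0).dtype)  # float32

# integer array + integer array: still upcasts by range
print((np.ones(3, dtype=np.int8) + np.ones(3, dtype=np.int16)).dtype)  # int16
```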
Am I right (Chris, Perry?) that this deals with almost all your cases? Meaning that it is upcasting of floats that is the main problem, not upcasting of (u)ints?
This rule seems to me not very far from the current 1.6 behavior; it upcasts more - but the dtype is now predictable. It's easy to explain. It avoids the obvious errors that the 1.6 rules were trying to avoid. It doesn't seem too far to stretch to make a distinction between rules about range (ints) and rules about precision (float, complex).
What do you'all think?
Personally, I think the main issue with my suggestion is that it seems hard to get there from the current behavior -- without potentially breaking existing code in non-obvious ways. The main problematic case I foresee is the typical "small_int_array + 1", which would now get upcast while it wasn't before (in either 1.5 or 1.6). That's why I think Nathaniel's proposal is more practical.
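The problematic case being described, sketched (my example, not from the thread): today the scalar 1 does not widen a small integer array, and code may rely on the result staying small:

```python
import numpy as np

small = np.array([1, 2, 3], dtype=np.int8)
result = small + 1
print(result.dtype)  # int8 today (1.5 and 1.6 alike)
```

Under the suggested rule, the scalar would promote like an int64 array operand, so the result dtype (and memory footprint) of code like this would silently change.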
It's important to establish the behavior we want in the long term, because it will likely affect the stop-gap solution we choose now. For example, if we think that the 1.5 behavior is desired in the long term, then Nathaniel's solution seems good (although it will change behavior from 1.6.x). If we think that your suggestion is preferable for the long term, sticking with the 1.6 behavior is more attractive. It seems to me we need the use-cases laid out properly in order to decide; at the moment we are working somewhat blind, at least in my opinion.

Cheers,

Matthew