
On Wed, 2019-06-05 at 21:35 -0400, Marten van Kerkwijk wrote:
Hi Sebastian,
Tricky! It seems a balance between unexpected memory blow-up and unexpected wrapping (the latter mostly for integers).
Some comments specifically on your message first, then some more general related ones.
1. I'm very much against letting `a + b` do anything else than `np.add(a, b)`.

2. For python values, an argument for casting by value is that a python int can be arbitrarily long; the only reasonable course of action for those seems to be to make them float, and once you do that one might as well cast to whatever type can hold the value (at least approximately).
Just to throw it in: in the long run, instead of trying to find a minimal dtype (which is a bit random), simply ignoring the value of the scalar may actually be the better option. The reason is code like:

```
arr = np.zeros(5, dtype=np.int8)
for i in range(200):
    res = arr + i
    print(res.dtype)  # switches from int8 to int16!
```

Instead, try `np.int8(i)` in the loop, and raise an error if it fails. Or, if that is a bit nasty – especially for interactive usage – we could go with a warning. This is nothing we need to decide soon, since some of the complexity will remain either way (i.e. you still need to know whether the scalar is a floating point number or an integer and adjust the logic).

Best,

Sebastian
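For illustration, the stricter behaviour described above could be sketched roughly as follows. Note that `strict_scalar` is a hypothetical helper written for this sketch, not an existing NumPy function:

```python
import numpy as np

def strict_scalar(value, dtype):
    # Hypothetical helper: reject Python ints that do not fit the
    # array's dtype, instead of silently upcasting the result.
    info = np.iinfo(dtype)
    if not (info.min <= value <= info.max):
        raise OverflowError(f"{value} does not fit in {np.dtype(dtype)}")
    return dtype(value)

arr = np.zeros(5, dtype=np.int8)
res = arr + strict_scalar(100, np.int8)  # result dtype stays int8
# strict_scalar(200, np.int8) would raise OverflowError instead of
# letting the result dtype jump to int16.
```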
3. Not necessarily preferred, but for casting of scalars, one can get more consistent behaviour also by extending the casting by value to any array that has size=1.
Overall, just on the narrow question, I'd be quite happy with your suggestion of using type information if available, i.e., only casting python values to a minimal dtype. If one uses numpy types, those mostly will have come from previous calculations with the same arrays, so things will work as expected. And in most memory-limited applications one would do calculations in-place anyway (or, as Tyler noted, power users can be assumed to be aware of memory, and thus to have an incentive to state explicitly what dtype is wanted – just `np.add(a, b, dtype=...)`, no need to create `out`).
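For illustration, passing `dtype=` to the ufunc fixes the result type up front, so neither value-based nor type-based promotion comes into play (a minimal sketch):

```python
import numpy as np

a = np.zeros(5, dtype=np.int8)
b = np.ones(5, dtype=np.int8)

# Without dtype=, int8 + int8 would give int8; requesting int16
# explicitly avoids any reliance on the promotion rules.
res = np.add(a, b, dtype=np.int16)
```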
More generally, I guess what I don't like about the casting rules generally is that there is a presumption that if the value can be cast, the operation will generally succeed. For `np.add` and `np.subtract`, this perhaps is somewhat reasonable (though for unsigned a bit more dubious), but for `np.multiply` or `np.power` it is much less so. (Indeed, we had a long discussion about what to do with `int ** power` - now special-casing negative integer powers.) Changing this, however, probably really is a bridge too far!
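Both failure modes are easy to demonstrate (behaviour as observed in recent NumPy versions):

```python
import numpy as np

# The scalar 2 casts to uint8 without any problem, but the
# multiplication itself overflows and wraps: 200 * 2 = 400 -> 144.
a = np.array([200], dtype=np.uint8)
wrapped = a * 2

# Negative integer powers are now special-cased to raise an error
# rather than silently produce a wrong integer result.
try:
    np.arange(1, 4) ** -1
    power_raised = False
except ValueError:
    power_raised = True
```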
Finally, somewhat related: I think the largest confusion actually results from the `uint64 + int64 -> float64` casting. Should this cast to int64 instead?
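A quick check of the promotion in question: no integer dtype can hold both all int64 and all uint64 values exactly, so NumPy promotes the mixed pair to float64, which silently loses integer precision above 2**53.

```python
import numpy as np

# The mixed uint64/int64 pair promotes to float64.
promoted = np.result_type(np.uint64, np.int64)

res = np.array([1], dtype=np.uint64) + np.array([-1], dtype=np.int64)
```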
All the best,
Marten
_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion@python.org
https://mail.python.org/mailman/listinfo/numpy-discussion