[Numpy-discussion] Moving forward with value based casting
Sebastian Berg
sebastian at sipsolutions.net
Thu Jun 6 11:43:44 EDT 2019
On Wed, 2019-06-05 at 21:35 -0400, Marten van Kerkwijk wrote:
> Hi Sebastian,
>
> Tricky! It seems a balance between unexpected memory blow-up and
> unexpected wrapping (the latter mostly for integers).
>
> Some comments specifically on your message first, then some more
> general related ones.
>
> 1. I'm very much against letting `a + b` do anything else than
> `np.add(a, b)`.
Well, I tend to agree. But just to put it out there:
[1] + [2] == [1, 2]
np.add([1], [2]) == 3
So that is already far from true, since coercion has to occur. Of
course it is true that:
arr + something_else
will at some point force coercion of `something_else`, so that point is
only half valid if either `a` or `b` is already a numpy array/scalar.
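To make that concrete, here is a small sketch (the variable names are only for illustration) of how the same operands behave as plain lists, through the ufunc, and once one side is already an array:

```python
import numpy as np

list_sum = [1] + [2]            # Python list concatenation, no NumPy involved
ufunc_sum = np.add([1], [2])    # both lists are coerced to arrays, then added

arr = np.array([1])
mixed = arr + [2]               # the list is coerced because `arr` is an ndarray

print(list_sum)
print(ufunc_sum)
print(mixed)
```

So `a + b` and `np.add(a, b)` only agree once at least one operand is already a NumPy object.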
> 2. For python values, an argument for casting by value is that a
> python int can be arbitrarily long; the only reasonable course of
> action for those seems to make them float, and once you do that one
> might as well cast to whatever type can hold the value (at least
> approximately).
To be honest, the "arbitrarily long" thing is another issue, which is the
silent conversion to "object" dtype. That is also on the list of things
not yet done: maybe we should deprecate it.
In other words, we would freeze python int to one clear type; if you
have an arbitrarily large int, you would need to use `object` dtype (or
preferably a new `pyint/arbitrary_precision_int` dtype) explicitly.
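As a rough illustration of the silent conversion I mean (a sketch of today's behaviour as I understand it, not of any proposal; the variable names are made up):

```python
import numpy as np

small = np.array([2**40])     # fits into a fixed-width integer dtype
huge = np.array([2**100])     # too large for any fixed-width NumPy integer

print(small.dtype.kind)       # 'i', a signed fixed-width integer
print(huge.dtype)             # object, chosen silently

# What being explicit would look like with the existing `object` dtype:
explicit = np.array([2**100], dtype=object)
total = explicit + 1          # arithmetic is delegated to the Python ints
print(total)
```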
> 3. Not necessarily preferred, but for casting of scalars, one can get
> more consistent behaviour also by extending the casting by value to
> any array that has size=1.
>
That sounds just as horrible as the current mismatch to me, to be
honest.
> Overall, just on the narrow question, I'd be quite happy with your
> suggestion of using type information if available, i.e., only cast
> python values to a minimal dtype. If one uses numpy types, those
> mostly will have come from previous calculations with the same
> arrays, so things will work as expected. And in most memory-limited
> applications, one would do calculations in-place anyway (or, as Tyler
> noted, for power users one can assume awareness of memory and thus
> the incentive to tell explicitly what dtype is wanted - just
> `np.add(a, b, dtype=...)`, no need to create `out`).
>
> More generally, I guess what I don't like about the casting rules
> generally is that there is a presumption that if the value can be
> cast, the operation will generally succeed. For `np.add` and
> `np.subtract`, this perhaps is somewhat reasonable (though for
> unsigned a bit more dubious), but for `np.multiply` or `np.power` it
> is much less so. (Indeed, we had a long discussion about what to do
> with `int ** power` - now special-casing negative integer powers.)
> Changing this, however, probably really is a bridge too far!
Indeed, that is right. But that is a different point. E.g. there would be
nothing wrong with `np.power` deciding that `int**power` should always
_promote_ (not cast) `int` to some larger integer type if available.
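For reference, a short sketch of the two behaviours mentioned here as they currently stand, to the best of my knowledge: integer array results that do not fit wrap silently, while integers raised to negative integer powers are special-cased to raise. The names are only illustrative.

```python
import numpy as np

base = np.array([100], dtype=np.int8)

# The result dtype follows the input dtypes, so the product wraps around.
wrapped = np.multiply(base, base)
print(wrapped.dtype, wrapped)

# Integers to negative integer powers are rejected rather than cast.
raised = False
try:
    np.power(np.array([2]), np.array([-1]))
except ValueError as exc:
    raised = True
    print("ValueError:", exc)
```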
The only place where we seriously have such logic right now is
np.add.reduce (sum) and np.multiply.reduce (prod), which always use at
least `long` precision (and actually upcast bool->int, although
np.add(True, True) does not; another difference to True + True...).
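A minimal sketch of that difference, assuming a reasonably recent NumPy (the exact accumulator width depends on the platform, so it is not printed as a fixed value):

```python
import numpy as np

small = np.array([100, 100], dtype=np.int8)

elementwise = np.add(small, small)   # stays int8, so 200 wraps around
reduced = np.add.reduce(small)       # accumulates in a wider integer type

flags = np.array([True, True])
bool_add = np.add(True, True)        # stays a NumPy bool
bool_sum = np.add.reduce(flags)      # bools are upcast to an integer first
py_add = True + True                 # plain Python gives the int 2

print(elementwise, elementwise.dtype)
print(reduced, reduced.dtype)
print(bool_add, bool_sum, py_add)
```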
>
> Finally, somewhat related: I think the largest confusion actually
> results from the `uint64+int64 -> float64` casting. Should this cast
> to int64 instead?
Not sure, but yes, it is the other quirk in our casting that should be
discussed.
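For anyone who has not run into it, a quick sketch of the quirk (the values are picked only to make the precision loss visible):

```python
import numpy as np

promoted = np.result_type(np.uint64, np.int64)
print(promoted)

big = np.array([2**63 + 1], dtype=np.uint64)
one = np.array([1], dtype=np.int64)
total = big + one                      # both operands are converted to float64

exact = 2**63 + 2                      # what integer arithmetic would give
print(total.dtype, int(total[0]) == exact)
```

No integer dtype can hold both ranges, so the result is a float and the low bits of the large value are lost.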
- Sebastian
>
> All the best,
>
> Marten
>