[Numpy-discussion] Integers to integer powers, let's make a decision

Fri Jun 10 20:28:30 EDT 2016

On 06/10/2016 03:38 PM, Alan Isaac wrote:
>>>> np.find_common_type([np.int8],[np.int32])
> dtype('int8')
>>>> (np.arange(10,dtype=np.int8)+np.int32(2**10)).dtype
> dtype('int16')
> 
> And so on.  If these other binary operators upcast based
> on the scalar value, why wouldn't exponentiation?
> I suppose the answer is: they upcast only insofar
> as necessary to fit the scalar value, which I see is
> a simple and enforceable rule.  However, that seems the wrong
> rule for exponentiation, and in fact it is not in play:
> 
>>>> (np.int8(2)**2).dtype
> dtype('int32')

My understanding is that numpy never upcasts based on the values; it
upcasts based on the datatype ranges.

http://docs.scipy.org/doc/numpy-1.10.1/reference/ufuncs.html#casting-rules

For arrays of different datatypes, numpy finds the datatype which can
store values in both dtypes' ranges, *not* the type which is large
enough to accurately store the result values.

So for instance,

    >>> (np.arange(10, dtype=np.uint8) + np.uint32(2**32-1)).dtype
    dtype('uint32')

Overflow has occurred, but numpy didn't upcast to uint64.
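
As a rough sketch of the rule above (the variable names are only for
illustration, and both operands are made arrays so that only the dtypes,
not the values, take part in the promotion):

```python
import numpy as np

# Promotion between two dtypes depends only on their ranges.
small = np.promote_types(np.uint8, np.uint32)   # uint32 covers both ranges
mixed = np.promote_types(np.int8, np.uint8)     # int16 is the smallest type covering both

a = np.arange(10, dtype=np.uint8)
b = np.array([2**32 - 1], dtype=np.uint32)

# The sum is computed in uint32, so it wraps modulo 2**32
# instead of being upcast to uint64.
total = a + b

print(small, mixed, total.dtype)
print(total)
```

The first element keeps the value 2**32-1 and every later element wraps
around to a small number, while the result dtype stays uint32.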

This rule has some slightly strange consequences. For example, the
ranges of np.int8 and np.uint64 don't match up, and numpy has decided
that the only type covering both ranges is np.float64.

So as an extra twist in this discussion, this means numpy actually
*does* return a float value for an integer power in a few cases:

    >>> type( np.uint64(2) ** np.int8(3) )
    numpy.float64
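
The same thing can be seen by asking numpy for the promoted type
directly. This is only a sketch of what I would expect; the int8/uint32
pair is added for contrast and is not from the thread:

```python
import numpy as np

# No integer dtype holds both negative int8 values and the top of the
# uint64 range, so the promoted type is float64.
widest = np.promote_types(np.int8, np.uint64)

# A smaller unsigned type still has a signed integer type covering it.
contrast = np.promote_types(np.int8, np.uint32)

# The power of two numpy scalars follows the same promotion.
result = np.uint64(2) ** np.int8(3)

print(widest, contrast)
print(type(result), result)
```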

> OK, my question to those who have argued a**2 should
> produce an int32 when a is an int32: what if a is an int8?
> (Obviously the overflow problem is becoming extremely pressing ...)

To me, whether it's int8 or int32, the user should just be aware of
overflow.

Also, I like to think of numpy as having quite C-like behavior, allowing
you to play with the low-level bits and bytes. (I actually wish its
casting behavior were more C-like.) I suspect that people working with
uint8 arrays might be doing byte-fiddling hacks and actually *want*
overflow/wraparound to occur, at least when multiplying/adding.

Allan

PS
I would concede that numpy's uint8 integer power currently doesn't
wrap around like multiply does, but it would be cool if it did.
(Modulo arithmetic is associative, so it should, right?)

    >>> x = np.arange(256, dtype='uint8')
    >>> x**8             # returns all 0
    >>> x*x*x*x*x*x*x*x  # returns wrapped values
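
To make the "wrapped values" concrete, here is a small sketch that checks
the repeated multiplication against reference values computed with
Python's own integers (pow with a modulus). The names product and
expected are just for illustration. It says nothing about what x**8
itself returns in any given numpy version, only what a wrapping uint8
power would be expected to give:

```python
import numpy as np

x = np.arange(256, dtype=np.uint8)

# Repeated multiplication stays in uint8, so every step wraps modulo 256.
product = x * x * x * x * x * x * x * x

# Reference values from Python's arbitrary-precision integers,
# reduced modulo 256 (the number of values a uint8 can hold).
expected = np.array([pow(i, 8, 256) for i in range(256)], dtype=np.uint8)

print(np.array_equal(product, expected))
```

Reducing modulo 256 after every multiplication gives the same answer as
reducing once at the end, which is why the two agree.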