On Tue, Apr 12, 2011 at 8:24 AM, Robert Kern <robert.kern@gmail.com> wrote:
On Mon, Apr 11, 2011 at 23:43, Mark Wiebe <mwwiebe@gmail.com> wrote:
> On Mon, Apr 11, 2011 at 8:48 PM, Travis Oliphant <oliphant@enthought.com>
> wrote:

>> It would be good to see a simple test case and understand why the boolean
>> multiplied by the scalar double is becoming a float16.     In other words,
>>  why does
>> (1-test)*t
>> return a float16 array
>> This does not sound right at all and it would be good to understand why
>> this occurs, now.   How are you handling scalars multiplied by arrays in
>> general?
> The reason it's float16 is that the first function in the multiply function
> list for which both types can be safely cast to the output type,

Except that float64 cannot be safely cast to float16.

That's true, but it was already being done this way with respect to float32. Rereading the documentation for min_scalar_type, I see the explanation could elaborate on the purpose of the function further. Float64 cannot be safely cast to float32, but this is what NumPy does:

>>> import numpy as np
>>> np.__version__
>>> np.float64(3.5) * np.ones(2,dtype=np.float32)
array([ 3.5,  3.5], dtype=float32)

> after
> applying the min_scalar_type function to the scalars, is float16.

This is implemented incorrectly, then. It makes no sense for floats,
for which the limiting attribute is precision, not range. For floats,
the result of min_scalar_type should be the type of the object itself,
nothing else. E.g. min_scalar_type(x)==float64 if type(x) is float no
matter what value it has.

I believe this behavior is necessary to avoid undesirable promotion of arrays to float64. If you are working with extremely large float32 arrays, and multiply or add a Python float, you would always have to say np.float32(value), something rather tedious. Looking at the archives, I see you've explained this before. ;)


The re-implementation of type casting, other than for the case at hand which I consider to be a wrong behavior, follows the existing pattern as closely as I could understand it from the previous implementation. I tightened the rules just slightly to avoid some problematic downcasts which previously were occurring:

>>> np.__version__
>>> 100000*np.ones(2,dtype=np.int8)
array([-31072, -31072], dtype=int16)
>>> 1e60*np.ones(2,dtype=np.float32)
array([ Inf,  Inf], dtype=float32)

>>> np.__version__
>>> 100000*np.ones(2,dtype=np.int8)
array([100000, 100000], dtype=int32)
>>> 1e60*np.ones(2,dtype=np.float32)
array([  1.00000000e+60,   1.00000000e+60])