[Numpy-discussion] Odd numerical difference between Numpy 1.5.1 and Numpy > 1.5.1

Mark Wiebe mwwiebe at gmail.com
Tue Apr 12 14:51:55 EDT 2011


On Tue, Apr 12, 2011 at 11:17 AM, Charles R Harris <
charlesr.harris at gmail.com> wrote:

>
>
> On Tue, Apr 12, 2011 at 11:56 AM, Robert Kern <robert.kern at gmail.com>wrote:
>
>> On Tue, Apr 12, 2011 at 12:27, Charles R Harris
>> <charlesr.harris at gmail.com> wrote:
>>
>> > IIRC, the behavior with respect to scalars sort of happened in the code
>> on
>> > the fly, so this is a good discussion to have. We should end up with
>> > documented rules and tests to enforce them. I agree with Mark that the
>> tests
>> > have been deficient up to this point.
>>
>> It's been documented for a long time now.
>>
>> http://docs.scipy.org/doc/numpy/reference/ufuncs.html#casting-rules
>>
>>
> Nope, the kind stuff is missing. Note the cast to float32 that Mark pointed
> out. Also that the casting of python integers depends on their sign and
> magnitude.
>
> In [1]: ones(3, '?') + 0
> Out[1]: array([1, 1, 1], dtype=int8)
>
> In [2]: ones(3, '?') + 1000
> Out[2]: array([1001, 1001, 1001], dtype=int16)
>

This is the behaviour with master - it's a good idea to cross-check with an
older NumPy. I think we're discussing 3 things here, what NumPy 1.5 and
earlier did, what NumPy 1.6 beta currently does, and what people think NumPy
did. The old implementation had a bit of a spaghetti-factor to it, and had
problems like asymmetry and silent overflows. The new implementation is in
my opinion cleaner and follows well-defined semantics while trying to stay
true to the old implementation. I admit the documentation I wrote doesn't
fully explain them, but here's the rule for a set of arbitrary arrays (not
necessarily just 2):

- if all the arrays are scalars, do type promotion on the types as is
- otherwise, do type promotion on min_scalar_type(a) of each array a

The function min_scalar_type returns the array type if a has >= 1
dimensions, or the smallest type of the same kind (allowing int->uint in the
case of positive-valued signed integers) to which the value can be cast
without overflow if a has 0 dimensions.

The promote_types function used for the type promotion is symmetric and
associative, so the result won't change when shuffling the inputs. There's a
bit of a wrinkle in the implementation to handle the fact that the uint type
values aren't a strict subset of the same-sized int type values, but
otherwise this is what happens.

https://github.com/numpy/numpy/blob/master/numpy/core/src/multiarray/convert_datatype.c#L1075

The change I'm proposing is to modify this as follows:

- if all the arrays are scalars, do type promotion on the types as is
- if the maximum kind of all the scalars is > the maximum kind of all the
arrays, do type promotion on the types as is
- otherwise, do type promotion on min_scalar_type(a) of each array a

One case where this may not capture a possible desired semantics is
[complex128 scalar] * [float32 array] -> [complex128]. In this case
[complex64] may be desired. This is directly analogous to the original
[float64 scalar] * [int8 array], however, and in the latter case it's clear
a float64 should result.

-Mark
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20110412/59e2dc7f/attachment.html>


More information about the NumPy-Discussion mailing list