[Numpy-discussion] Odd numerical difference between Numpy 1.5.1 and Numpy > 1.5.1

Mark Wiebe mwwiebe at gmail.com
Tue Apr 12 00:43:34 EDT 2011


On Mon, Apr 11, 2011 at 8:48 PM, Travis Oliphant <oliphant at enthought.com> wrote:

>
> On Apr 11, 2011, at 3:55 PM, Charles R Harris wrote:
> <snip>
> I agree with Charles.   Let's take the needed time and work this through.
> This is the sort of thing I was a bit nervous about with the changes made to
> the casting rules.    Right now, I'm not confident that the scalar-array
> casting rules are being handled correctly.
>

I would suggest viewing this as having exposed a problem in the NumPy test
suite. All necessary type-promotion behaviors should have tests for them,
and I added tests because before 1.6 there were none. This case, where the
scalar type has a greater "kind" than the array type, was missed and had no
preexisting tests. If the test suite is thorough, you can be confident that
the scalar-array casting rules are being handled correctly.

Here's where to add more tests. Both np.add and np.result_type are validated
with the same set of tests.

https://github.com/numpy/numpy/blob/master/numpy/core/tests/test_numeric.py#L345
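
To sketch what I mean (this is just an illustration, not the tests that are
actually in that file; check_promotion and the specific cases below are
hypothetical), a test for the missed case might look something like:

    import numpy as np
    from numpy.testing import assert_equal

    def check_promotion(a, b, expected):
        # Both the ufunc result and np.result_type should agree, since
        # they are meant to follow a single set of promotion rules.
        assert_equal(np.add(a, b).dtype, np.dtype(expected))
        assert_equal(np.result_type(a, b), np.dtype(expected))

    # Cases where the scalar's kind ('f') is greater than the array's
    # kind ('b' or 'i'): the scalar should not get demoted below float64.
    check_promotion(np.ones(3, dtype=np.bool_), 2.0, np.float64)
    check_promotion(np.ones(3, dtype=np.int8), 2.0, np.float64)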


> From your explanation, I don't understand how you intend to solve the
> problem.   Surely there is a better approach than special casing the binary
> op and this work-around flag (and why is the result_type even part of the
> discussion)?
>

Yes, there's definitely a better approach, but I don't see one without
changing the ufunc API. In other libraries and programming languages, type
promotion is defined by a set of explicit promotion rules, but because the
current ufuncs carry a list of loop functions and do a linear search through
that list to find a matching loop, defining and implementing such an
explicit rule set is more difficult.
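
To illustrate the mechanism (a conceptual Python sketch of that linear
search, not the actual C implementation inside the ufunc machinery;
find_loop and the toy loop list are made up for illustration):

    import numpy as np

    def find_loop(loop_sigs, in_dtypes):
        # Return the first signature whose inputs can safely accept the
        # given dtypes -- a toy model of the ufunc's loop search.
        for sig_in, sig_out in loop_sigs:
            if all(np.can_cast(d, s, casting='safe')
                   for d, s in zip(in_dtypes, sig_in)):
                return sig_in, sig_out
        raise TypeError("no matching loop found")

    # A slice of a multiply-like loop list, ordered smallest to largest:
    loops = [((np.dtype(np.float16),) * 2, np.dtype(np.float16)),
             ((np.dtype(np.float32),) * 2, np.dtype(np.float32)),
             ((np.dtype(np.float64),) * 2, np.dtype(np.float64))]

    # bool and float16 both cast safely to float16, so that loop matches first.
    print(find_loop(loops, (np.dtype(np.bool_), np.dtype(np.float16))))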

The reason to bring the numpy.result_type function into it is that it
cleanly encapsulates the NumPy type-promotion rules. A good design needs a
single, consistent way to handle type promotion, so that it can be validated
for correctness in one place. I created the numpy.promote_types,
numpy.min_scalar_type, and numpy.result_type functions for this purpose, and
numpy.nditer uses this API to determine the output type when it allocates a
requested output array. By fixing result_type, then having the ufunc binary
operations use it, we can be confident that the type promotion will be
consistent.
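
As a quick illustration of how those pieces fit together (the commented
results are what I expect from these functions; they're easy to check
interactively, and the array/scalar values are just examples):

    import numpy as np

    # promote_types: pure dtype-to-dtype promotion, no value-based logic
    print(np.promote_types(np.bool_, np.float64))     # float64

    # min_scalar_type: smallest dtype that can hold a given scalar value
    print(np.min_scalar_type(2.0))                    # float16 (2.0 fits in a half)
    print(np.min_scalar_type(1e50))                   # float64 (overflows float32)

    # result_type: promotion for a mix of arrays and scalars, encapsulating
    # the rules in one place; here the scalar 0.5 fits in a smaller float,
    # so the array's type wins
    print(np.result_type(np.ones(3, dtype=np.float32), np.float64(0.5)))  # float32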

> It would be good to see a simple test case and understand why the boolean
> multiplied by the scalar double is becoming a float16. In other words,
> why does
>
> (1-test)*t
>
> return a float16 array
>
> This does not sound right at all and it would be good to understand why
> this occurs, now.   How are you handling scalars multiplied by arrays in
> general?
>

The reason it's float16 is that, after applying the min_scalar_type function
to the scalars, the first function in the multiply function list for which
both types can be safely cast to the output type is float16. The change I'm
suggesting is to skip the min_scalar_type step when the "kind" of the scalar
type is higher than the "kind" of the array type. Maybe it can be done
without switching the ufunc to use result_type in 1.6, but result_type needs
to be fixed as well, so that the nditer and anything else that wants to
follow the ufunc rules stays consistent.
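
To make that concrete (test and t below are just stand-ins for the arrays in
the original report; the first two calls show the promotion the 1.6 beta
effectively performed, the last one shows what the proposed rule gives):

    import numpy as np

    test = np.array([True, False, True])    # stand-in boolean array
    t = 2.0                                 # stand-in double scalar

    # 1.6 beta: demote the scalar by value first...
    print(np.min_scalar_type(t))                     # float16 -- 2.0 fits in a half
    # ...then the first multiply loop both operands cast safely into is float16:
    print(np.promote_types(test.dtype, np.float16))  # float16

    # Proposed rule: the scalar's kind ('f') is higher than the array's
    # kind ('b'), so skip the min_scalar_type demotion and use the
    # scalar's own type:
    print(np.promote_types(test.dtype, np.float64))  # float64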

I created a ticket for the issue here:
http://projects.scipy.org/numpy/ticket/1798

Cheers,
Mark