[Numpy-discussion] Re: NEP 50: Promotion rules for Python scalars

1 Jun 2022

      On Wed, 2022-06-01 at 20:23 +0200, Ralf Gommers wrote:
...
On Wed, Jun 1, 2022 at 5:51 PM Sebastian Berg <sebastian@sipsolutions.net> wrote:
...
An important part of moving forward will be assessing the real-world
impact.  To start that process, I have created a branch as a draft PR
(at this time):
    https://github.com/numpy/numpy/pull/21626
It is missing some parts, but should allow preliminary testing. The
main missing part is that the integer warnings and errors are less
strict than proposed in the NEP.
It would be invaluable to get a better idea of the extent to which
existing code, especially end-user code, is affected by the proposed
changes.

Thanks Sebastian! For testing, did you already try with some of the usual
suspects, or would it be helpful to use this branch on SciPy, Pandas, etc.?
Also, do you expect it's useful to do platform-specific testing? I can
imagine there's some Windows-specific behavior; adapting a SciPy CI job to
work from your branch is easy to do if that would be helpful.

Yes, I have for SciPy.  As noted in the PR, the failures look "mostly
harmless" at first sight (not that it won't mean quite a bit of work,
but I think it is manageable work).
I would be more scared if there were a need to systematically vet all
places where behavior (may have) changed.

For example, in NumPy:

   np.median(np.float32([1, 2, 3, 4]))

returned a float64 before and will now return a float32.  I assume that
is because somewhere we write: `(np.float64(3) + np.float32(2)) / 2`.
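
A minimal sketch of how one might check this kind of dtype change locally. The helpers `result_dtype` and `middle_mean` are hypothetical and only for illustration; `middle_mean` is not NumPy's actual median implementation. It averages two float32 scalars with a Python int divisor, which is the sort of scalar expression whose result dtype I would expect to depend on the promotion rules in use:

```python
import numpy as np


def result_dtype(func, *args):
    """Return the dtype of func(*args) so runs on different builds can be compared."""
    return np.asarray(func(*args)).dtype


def middle_mean(arr):
    """Hypothetical stand-in: average the two middle elements of a 4-element array."""
    # arr[1] and arr[2] are NumPy scalars; 2 is a plain Python int.
    return (arr[1] + arr[2]) / 2


data = np.float32([1, 2, 3, 4])

print("NumPy", np.__version__)
print("np.median   ->", result_dtype(np.median, data))
print("middle_mean ->", result_dtype(middle_mean, data))
```

Running the same script on a release build and on the branch, then comparing the printed dtypes, should show whether a given expression is affected.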

There are a few places that I suspect just need updated tests or a bit
of thought.  And at least one or two that need to use the correct integer
types (IIRC `scipy.io.idl` seems to be using some low precision or
unsigned integer type internally and that leads to failures).
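
To illustrate the general kind of integer issue (this is only a sketch and makes no claim about what `scipy.io.idl` actually does), the snippet below mixes a small unsigned scalar with a Python int that does not fit its type. Depending on the promotion rules in use, the mixed operation may upcast or raise, so it is wrapped in `try`/`except`; converting to a wide integer type first avoids the question:

```python
import numpy as np

small = np.uint8(200)  # stand-in for a low-precision unsigned value

# Mixing the small scalar with a Python int that does not fit in uint8.
# Depending on the promotion rules in use, this may upcast or raise.
try:
    mixed = small + 300
    print("mixed result:", mixed, np.asarray(mixed).dtype)
except OverflowError as exc:
    print("mixed arithmetic raised:", exc)

# Converting to a wide integer type first keeps the result well defined.
wide = np.int64(small) + 300
print("explicit int64 result:", wide, wide.dtype)
```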

I thought pandas would fail much harder, but it seems to have had only
150-200 failures (many probably clustered).  One larger annoyance there
is that one parametrized test runs into an infinite recursion, which
makes it run excruciatingly slowly.

In any case, I believe that it would be far more helpful if those more
familiar with the libraries have a look at the failures.  Not only do
they know better how much impact they have; it also helps to get a feel
for how painful the transition will be.

One problem I see is that I still expect that libraries are not the
main issue.
Using a SciPy integrator may end up with a float32 rather than a
float64 result.  In the SciPy test suite, that probably just means
tweaking the test a bit.
But that same change will also break someone's script out there,
somewhere.  So the people really affected (who may occasionally get less
precise/breaking results) are likely the end-users rather than the
libraries.
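
As a sketch of what this could look like in an end-user script: `trapezoid_area` below is a hypothetical hand-written trapezoidal rule, not SciPy code. Its final division of a NumPy scalar by a Python int is where I would expect the promotion rules to decide the result dtype for float32 input. A script that relies on float64 precision can pin it by casting the inputs explicitly, as in the (also hypothetical) `trapezoid_area_f64`:

```python
import numpy as np


def trapezoid_area(x, y):
    """Hypothetical user-written trapezoidal rule; not SciPy's implementation."""
    # The sum is a NumPy scalar; dividing it by the Python int 2 is where
    # the promotion rules decide the result dtype.
    return np.sum((y[1:] + y[:-1]) * np.diff(x)) / 2


def trapezoid_area_f64(x, y):
    """Same rule, with inputs cast to float64 so the result dtype is pinned."""
    x64 = np.asarray(x, dtype=np.float64)
    y64 = np.asarray(y, dtype=np.float64)
    return trapezoid_area(x64, y64)


x = np.linspace(0, 1, 5, dtype=np.float32)
y = x * x

# Result dtype for float32 input depends on the promotion rules in use.
print(np.asarray(trapezoid_area(x, y)).dtype)
# With the explicit cast the result is float64.
print(np.asarray(trapezoid_area_f64(x, y)).dtype)
```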

Cheers,

Sebastian
...
Cheers,
Ralf