[Numpy-discussion] Change in scalar upcasting rules for 1.6.x?

Mon Feb 13 23:30:43 EST 2012

On Monday, February 13, 2012, Charles R Harris <charlesr.harris at gmail.com>
wrote:
>
>
> On Mon, Feb 13, 2012 at 9:04 PM, Travis Oliphant <travis at continuum.io>
wrote:
>>
>> I disagree with your assessment of the subscript operator, but I'm sure
we will have plenty of time to discuss that. I don't think it's correct to
compare the corner cases of fancy indexing and regular indexing to the
corner cases of the type coercion system. If you recall, I was quite nervous
about all the changes you made to the coercion rules because I didn't
believe you fully understood what had been done before and I knew there was
not complete test coverage.
>> It is true that both systems have emerged from a long history and could
definitely use fresh perspectives which we all appreciate you and others
bringing.   It is also true that few are aware of the details of how things
are actually implemented and that there are corner cases that are basically
defined by the algorithm used (this is more true of the type-coercion
system than fancy-indexing, however).
>> I think it would have been wise to write those extensive tests prior to
writing new code.   I'm curious if what you were expecting for the output
was derived from what earlier versions of NumPy produced.    NumPy has
never been in a state where you could just re-factor at will and assume
that tests will catch all intended use cases.   Numeric before it was not
in that state either.   This is a good goal, and we always welcome new
tests. It just takes a lot of time and a lot of tedious work that the
volunteers to this point have not had the time to do.
>> Very few of us have ever been paid to work on NumPy directly and have
often been trying to fit in improvements to the code base between other
jobs we are supposed to be doing.    Of course, you and I are hoping to
change that this year and look forward to the code quality improving
commensurately.
>> Thanks for all you are doing. I also agree that Ralf and Charles
have been and are invaluable in the maintenance and progress of NumPy and
SciPy. They deserve as much praise and kudos as anyone can give them.
>
> Well, the typecasting wasn't perfect and, as Mark points out, it wasn't
commutative. The addition of float16 also complicated the picture, and user
types are going to do more in that direction. And I don't see how a new
developer should be responsible for tests enforcing old traditions; the
original developers should be responsible for those. But history is
history. It didn't happen that way, and here we are.
>
> That said, I think we need to show a little flexibility in the corner
cases. And going forward I think that typecasting is going to need a
rethink.
>
> Chuck
>
> On Feb 13, 2012, at 9:40 PM, Mark Wiebe wrote:
>
> I believe the main lessons to draw from this are just how incredibly
important a complete test suite and staying on top of code reviews are. I'm
of the opinion that any explicit design choice of this nature should be
reflected in the test suite, so that if someone changes it years later,
they get immediate feedback that they're breaking something important.
NumPy has gradually increased its test suite coverage, and when I dealt
with the type promotion subsystem, I added fairly extensive tests:
>
https://github.com/numpy/numpy/blob/master/numpy/core/tests/test_numeric.py#L345
> Another subsystem in a state similar to the one the type promotion
subsystem was in is the subscript operator and how regular/fancy indexing
work. What this means is that any attempt to improve it that doesn't
coincide with the original intent years ago can easily break things that
were originally intended without them being caught by a test. I believe
this subsystem needs improvement, and the transition to new/improved code
will probably be trickier to manage than for the dtype promotion case.
> Let's try to learn from the type promotion case as best we can, and use
it to improve NumPy's process. I believe Charles and Ralf have been doing
a great job of enforcing high standards in new NumPy code, and managing the
release process in a way that has resulted in very few bugs and regressions
in the release. Most of these quality standards are still informal,
however, and it's probably a good idea to write them down in a canonical
location. It will be especially helpful for newcomers, who can treat the
standards as a checklist before submitting pull requests.
> Thanks,
> -Mark
>
> On Mon, Feb 13, 2012 at 7:11 PM, Travis Oliphant <travis at continuum.io>
wrote:
>
> The problem is that these sorts of things take a while to emerge.  The
original system was more consistent than I think you give it credit for.  What
you are seeing is that most people get NumPy from distributions and are
relying on us to keep things consistent.
> The scalar coercion rules were deterministic and based on the idea that a
scalar does not determine the output dtype unless it is of a different
kind. Unfortunately, the new code changes that.
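
The rule described in the paragraph above can be illustrated with a toy
model. This is a minimal sketch in plain Python, not NumPy's implementation:
`DType`, `KIND_ORDER` and `coerce_array_scalar` are hypothetical names, and
reading "a different kind" as "a higher kind" is an assumption. What the
result should be when the scalar's kind is higher is left open in this
thread, so that branch simply returns the scalar's dtype as a placeholder.

```python
from collections import namedtuple

# Hypothetical, simplified dtype: a kind plus a size in bytes.
DType = namedtuple("DType", ["kind", "itemsize"])

# Kinds ordered from lowest to highest (assumed ordering for this sketch).
KIND_ORDER = {"bool": 0, "int": 1, "float": 2, "complex": 3}


def coerce_array_scalar(array_dtype, scalar_dtype):
    """Return the result dtype of `array <op> scalar` in this toy model."""
    if KIND_ORDER[scalar_dtype.kind] <= KIND_ORDER[array_dtype.kind]:
        # Same or lower kind: the scalar does not influence the result,
        # however wide the scalar itself is.
        return array_dtype
    # Higher kind: placeholder assumption, take the scalar's dtype.
    return scalar_dtype


float32 = DType("float", 4)
float64 = DType("float", 8)
int16 = DType("int", 2)
int64 = DType("int", 8)

print(coerce_array_scalar(float32, float64))  # same kind: array dtype kept
print(coerce_array_scalar(float32, int64))    # lower kind: array dtype kept
print(coerce_array_scalar(int16, float64))    # higher kind: scalar matters
```

The point of the sketch is only that the scalar's width is ignored whenever
its kind does not exceed the array's, so a float32 array combined with a
Python float would stay float32 under this reading.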
> Another thing I noticed is that I thought that int16 <op> scalar float
would produce float32 originally.  This seems to have changed, but I need
to check on an older version of NumPy.
> Changing the scalar coercion rules is an unfortunate and substantial change
in semantics and should not have happened in the 1.X series.
> I understand you did not get a lot of feedback and spent a lot of time on
the code, which we all appreciate. I worked to stay true to the Numeric
casting rules, incorporating the changes to prevent scalar upcasting due to
the absence of single precision Numeric literals in Python.
> We will need to look in detail at what has changed.  I will write a test
to do that.
> Thanks,
> Travis
> --
> Travis Oliphant
> (on a mobile)
> 512-826-7480
>
> On Feb 13, 2012, at 7:58 PM, Mark Wiebe <mwwiebe at gmail.com> wrote:
>
> On Mon, Feb 13, 2012 at 5:00 PM, Travis Oliphant <travis at continuum.io>
wrote:
>
> Hmmm.   This seems like a regression.  The scalar casting API was fairly
intentional.
>
> What is the reason for the change?
>
> In order to make 1.6 ABI-compatible with 1.5, I basically had to rewrite
this subsystem. There were virtually no tests in the test suite specifying
what the expected behavior should be, and there were clear inconsistencies
where for example
>

I actually remember there being some discussion about the type coercion
changes, and those of us involved agreed that the old behavior was
unintuitive and was likely a bug.  Commutativity is important.
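
As a rough illustration of what commutativity means for a promotion rule,
the sketch below brute-forces every pair of types and reports those where
swapping the operands changes the result. It is plain Python with made-up
rules: `RANK`, `promote_symmetric`, `promote_left_biased` and
`order_dependent_pairs` are hypothetical names, and the linear ranking is a
simplification that does not reproduce NumPy's actual promotion table.

```python
from itertools import combinations

# Hypothetical linear ranking of a few type names (illustration only).
RANK = {"int16": 0, "int32": 1, "float32": 2, "float64": 3}


def promote_symmetric(a, b):
    """Pick the higher-ranked type, regardless of operand order."""
    return a if RANK[a] >= RANK[b] else b


def promote_left_biased(a, b):
    """Deliberately order-dependent rule: the left operand always wins."""
    return a


def order_dependent_pairs(promote, names):
    """List the pairs for which swapping the operands changes the result."""
    return [
        (a, b)
        for a, b in combinations(names, 2)
        if promote(a, b) != promote(b, a)
    ]


names = sorted(RANK, key=RANK.get)
print(order_dependent_pairs(promote_symmetric, names))    # prints []
print(order_dependent_pairs(promote_left_biased, names))  # every distinct pair
```

A check of this shape could sit in a test suite so that an order-dependent
rule is reported as soon as it is introduced.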

NumPy already has one situation where users have to watch out for order
(list times integer gives a new array, while an integer times list throws
an exception). We really shouldn't have to keep worrying about bugs like
these, and the one I just mentioned isn't even NumPy's fault.

I have to agree with Mark's changes, and I think he was very open about the
possible impacts.  It just so happened that the ones who reviewed it were
probably newer users.

Ben Root