[Numpy-discussion] Do we want scalar casting to behave as it does at the moment?

Andrew Collette andrew.collette at gmail.com
Mon Jan 7 11:33:48 EST 2013


Hi Matthew,

> I realized when I thought about it, that I did not have a clear idea
> of your exact use case.  How does the user specify the thing to add,
> and why do you need to avoid an error in the case that adding would
> overflow the type?  Would you mind giving an idiot-level explanation?

There isn't a specific use case I had in mind... from a developer's
perspective, what bothers me about the proposed behavior is that every
use of "+" on user-generated input becomes a time bomb.  Since h5py
deals with user-generated files, I have to deal with all kinds of
dtypes, including low-precision ones like int8/uint8.  They come from
user-supplied function and method arguments, sure, but also from
datasets in files; attributes; virtually everywhere.

I suppose what I'm really asking is that numpy provide (or rather,
continue to provide) a default rule in this situation, as does every
other scientific language I've used.  One reason to prefer a default
behavior over a ValueError (in addition to the large amount of work
required to check every use of "+") is that users get an established
behavior they know to expect.

For example, one feature we're thinking of implementing involves
adding an offset to a dataset when it's read.  Should we roll over?
Upcast?  It seems to me there's great value in being able to say "We
do what numpy does."  If numpy doesn't answer the question, everybody
makes up their own rules.  There are certainly cases where the answer
is obvious to the application: you have a huge number of int8's and
don't want to upcast.  Or you don't want to lose precision.  But if
numpy provides a default rule, nobody is prevented from making careful
choices based on their application's requirements, and there's the
additional value of having a common, documented default behavior.
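
To make the two candidate rules concrete, here is a minimal sketch of
"roll over" versus "upcast" for an int8 dataset.  The helper names,
the sample values and the choice of int16 as the wider type are made
up for illustration; none of this is h5py or numpy API beyond plain
array arithmetic and astype().

```python
import numpy as np

def add_offset_rollover(data, offset):
    # Hypothetical helper: keep the dataset's dtype, so integer
    # overflow silently wraps around. Assumes the offset itself
    # fits in data.dtype.
    return data + np.array(offset, dtype=data.dtype)

def add_offset_upcast(data, offset, wide=np.int16):
    # Hypothetical helper: widen first, so the sum is exact at the
    # cost of more memory. int16 is an assumed choice of wider type.
    return data.astype(wide) + wide(offset)

data = np.array([100, 120], dtype=np.int8)

rolled = add_offset_rollover(data, 100)
upcast = add_offset_upcast(data, 100)

print(rolled, rolled.dtype)  # [-56 -36] int8
print(upcast, upcast.dtype)  # [200 220] int16
```

Both dtypes are spelled out explicitly here, so the result does not
depend on whichever scalar casting rule numpy ends up with; the
question in this thread is what a bare "data + 100" should do.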

Andrew


