[Numpy-discussion] Nasty bug using pre-initialized arrays

Fri Jan 4 20:42:16 EST 2008

On Fri, Jan 04, 2008 at 04:31:53PM -0700, Timothy Hochberg wrote:
> On Jan 4, 2008 3:28 PM, Scott Ransom <sransom at nrao.edu> wrote:
> 
> > On Friday 04 January 2008 05:17:56 pm Stuart Brorson wrote:
> > > >> I realize NumPy != Matlab, but I'd wager that most users would
> > > >> think that this is the natural behavior......
> > > >
> > > > Well, that behavior won't happen. We won't mutate the dtype of the
> > > > array because of assignment. Matlab has copy(-on-write) semantics
> > > > for things like slices while we have view semantics. We can't
> > > > safely do the reallocation of memory [1].
> > >
> > > That's fair enough.  But then I think NumPy should consistently
> > > > typecheck all assignments and throw an exception if the user attempts
> > > > an assignment which loses information.
> > >
> > > If you point me to a file where assignments are done (particularly
> > > from array elements to array elements) I can see if I can figure out
> > > how to fix it & then submit a patch.  But I won't promise anything!
> > > My brain hurts already after analyzing this "feature".....   :-)
> >
> > There is a long history in numeric/numarray/numpy about this "feature".
> > And for many of us, it really is a feature -- it prevents the automatic
> > upcasting of arrays, which is very important if your arrays are huge
> > (i.e. comparable in size to your system memory).
> >
> > For instance in astronomy, where very large 16-bit integer or 32-bit
> > float images or data-cubes are common, if you upcast your 32-bit floats
> > accidentally because you are doing double precision math (i.e. the
> > default in Python) near them, that can cause the program to swap out or
> > die horribly.  In fact, this exact example is one of the reasons why
> > the Space Telescope people initially developed numarray.  numpy has
> > kept that model.  I agree, though, that when using very mixed types
> > (i.e. complex and ints, for example), the results can be confusing.
> >
> 
> This isn't a very compelling argument in this case. The concern the numarray
> people were addressing was the upcasting of precision. However, there are
> two related hierarchies in numpy, one is the kind[1] of data, roughly: bool,
> int, float, complex. Each kind has various precisions. The numarray folks
> were concerned with avoiding upcasting of precision, not with avoiding
> upcasting of kinds. And, I can't see much (any?) justification for allowing
> automagic downcasting of kind, complex->float being the most egregious,
> other than backwards compatibility. This is clearly an opportunity for
> confusion and likely a magnet for bugs. And, I've yet to see any useful
> examples to support this behaviour. I imagine that there are some benefits,
> but I doubt that they are compelling enough to justify the current
> behaviour.

I wasn't at all arguing that having complex data chopped and
downcast into an int or float container was the right thing to do.
What I was trying to address was that preventing automatic
upcasting of the "kind" of data that you have is often very useful
and in fact was one of the driving reasons behind numarray.
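
A minimal sketch of the point above, assuming a reasonably recent NumPy.
The array names and sizes are made up for illustration. In-place
operations and assignment into an existing array keep the container's
dtype, while an out-of-place operation against float64 data allocates a
new, wider array:

```python
import numpy as np

# A (tiny) stand-in for a large 32-bit float image.
image = np.ones((4, 4), dtype=np.float32)

# In-place arithmetic with a Python float keeps the float32 container.
image *= 2.5

# Assigning float64 data into the existing array casts it down to float32.
image[0, :] = np.linspace(0.0, 1.0, 4)

# An out-of-place operation with a float64 array allocates a float64 result,
# which takes twice the memory of the original image.
weights = np.full((4, 4), 2.5)      # float64 by default
scaled = image * weights

print(image.dtype, scaled.dtype, scaled.nbytes // image.nbytes)
```

For an array comparable in size to system memory, that last line is the
case that causes swapping.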

For this particular complex-type example I think that an exception
would be the proper thing, since if you really want to throw away
the imaginary part it is easy enough to specify x.real.
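
As a hedged illustration of being explicit here: the sketch below
uses x.real to state the intent, and np.copyto to show what a checked
assignment could look like. np.copyto and its default casting="same_kind"
rule come from later NumPy releases and are not something discussed in this
thread; the array names are illustrative.

```python
import numpy as np

z = np.array([1 + 2j, 3 - 1j, 0.5j])   # complex128 source data
x = np.zeros(3)                         # pre-initialized float64 container

# Explicit: keep only the real part, so the intent is visible in the code.
x[:] = z.real

# A checked copy: np.copyto defaults to casting="same_kind", which refuses
# complex -> float and raises TypeError instead of dropping the imaginary part.
try:
    np.copyto(x, z)
    raised = False
except TypeError:
    raised = True

print(x, raised)
```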

But the float->int situation is different.  I can see good reasons
for both upcast and downcast.  This is one of those situations
where the programmer had better be sure that they know what they
are doing (and being explicit in the code would be better than
implicit).
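
A small sketch of the float->int case, again assuming a recent NumPy and
with made-up values. Plain assignment into an integer array truncates
toward zero; spelling out the rounding and the cast makes the choice
visible:

```python
import numpy as np

values = np.array([0.2, 1.7, -2.5, 3.999])   # float64 data
counts = np.zeros(4, dtype=np.int32)          # pre-initialized int container

# Implicit downcast: assignment truncates each value toward zero.
counts[:] = values
truncated = counts.copy()

# Explicit downcast: choose the rounding rule, then cast.
# np.rint rounds halves to the nearest even integer.
rounded = np.rint(values).astype(np.int32)

# Explicit upcast: ask for a float copy when float math is wanted.
as_float = counts.astype(np.float64)

print(truncated, rounded, as_float.dtype)
```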

Scott

-- 
Scott M. Ransom            Address:  NRAO
Phone:  (434) 296-0320               520 Edgemont Rd.
email:  sransom at nrao.edu             Charlottesville, VA 22903 USA
GPG Fingerprint: 06A9 9553 78BE 16DB 407B  FFCA 9BFA B6FF FFD3 2989