FW: [Numpy-discussion] Bug: extremely misleading array behavior

eric jones eric at enthought.com
Mon Jun 10 14:27:04 EDT 2002


> <Eric Jones writes>:
> > I further believe that all Numeric functions (sum, product, etc.)
> > should return arrays all the time instead of implicitly converting
> > them to Python scalars in special cases such as reductions of 1d
> > arrays. I think the only reason for the silent conversion is that
> > Python lists only allow integer values for use in indexing so that:
> >
> >  >>> a = [1,2,3,4]
> >  >>> a[array(0)]
> >  Traceback (most recent call last):
> >    File "<stdin>", line 1, in ?
> >  TypeError: sequence index must be integer
> >
> > Numeric arrays don't have this problem:
> >
> >  >>> a = array([1,2,3,4])
> >  >>> a[array(0)]
> >  1
> >
> > I don't think this alone is a strong enough reason for the
> > conversion. Getting rid of special cases is more important because
> > it makes behavior predictable to the novice (and expert), and it is
> > easier to write generic functions and be sure they will not break a
> > year from now when one of the special cases occurs.
> >
> > Are there other reasons why scalars are returned?
> >
> Well, sure. It isn't just indexing lists directly, it would be
> anywhere in Python that you would use a number. 

Travis seemed to indicate that Python would convert 0-d arrays to
Python types correctly for most (all?) cases.  Python indexing is
unusual because it explicitly requires integers.  It's not just 0-d
arrays that fail as indexes -- Python floats won't work either.
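
A minimal sketch of that restriction, written for a modern Python
interpreter rather than the one current when this was posted. The
`Rank0` class and the `try_index` helper are hypothetical stand-ins
invented for illustration; they are not part of Numeric.

```python
class Rank0:
    """Hypothetical stand-in for a rank-0 array (not Numeric's real type)."""

    def __init__(self, value):
        self.value = value


def try_index(seq, idx):
    """Return seq[idx], or the exception name if the index is rejected."""
    try:
        return seq[idx]
    except TypeError as exc:
        return type(exc).__name__


a = [1, 2, 3, 4]
int_result = try_index(a, 0)           # a plain integer is accepted
float_result = try_index(a, 0.0)       # a float is rejected
rank0_result = try_index(a, Rank0(0))  # so is an array-like wrapper

print(int_result, float_result, rank0_result)
```

The list rejects the float and the wrapper for the same reason: neither
is an integer as far as sequence indexing is concerned.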

As for passing arrays to functions expecting numbers, is it that much
different from passing an integer into a function that does floating
point operations?  Python handles this casting automatically.  It seems
like it should do the same for 0-d arrays if they know how to "look
like" Python types.
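
A rough sketch of what "looking like" a Python type could mean, assuming
a modern Python interpreter. `Rank0` is a hypothetical class made up for
this example, not Numeric's 0-d array; it only shows that a function
doing floating point work can accept an object that knows how to
convert itself.

```python
import math


class Rank0:
    """Hypothetical rank-0 array stand-in that can convert itself to float."""

    def __init__(self, value):
        self.value = value

    def __float__(self):
        # The hook Python calls when a C-level function needs a double.
        return float(self.value)


root_from_int = math.sqrt(4)           # the int is cast to float automatically
root_from_rank0 = math.sqrt(Rank0(4))  # the wrapper is cast through __float__

print(root_from_int, root_from_rank0)
```

Both calls go through the same conversion step, so the callee never
needs to know which kind of object it was handed.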
 
> In some contexts,
> the right thing may happen (where the function knows to try to obtain
> a simple number from an object), but then again, it may not (if
> calling a function where the number is used directly to index or
> slice).
> 
> Here is another case where good arguments can be made for both
> sides. It really isn't an issue of functionality (one can write
> methods or functions to do what is needed), it's what the convenient
> syntax does. For example, if we really want a Python scalar but
> rank-0 arrays are always returned then something like this may
> be required:
> 
> >>> x = arange(10)
> >>> a = range(10)
> >>> a[scalar(x[2])] # instead of a[x[2]]

Yes, this would be required for using them as array indexes.  Or
actually:

 >>> a[int(x[2])]
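
The explicit conversion can be sketched as follows, assuming a modern
Python interpreter. `Rank0` is a hypothetical stand-in for a rank-0
array, and the list `x` only imitates an `arange(10)` whose elements
come back as rank-0 objects.

```python
class Rank0:
    """Hypothetical stand-in for a rank-0 array (not Numeric's real type)."""

    def __init__(self, value):
        self.value = value

    def __int__(self):
        # Lets int(obj) recover a plain Python integer.
        return int(self.value)


a = list(range(10))
x = [Rank0(i) for i in range(10)]  # imitates arange(10) yielding rank-0 items

picked = a[int(x[2])]  # explicit conversion makes the index acceptable

try:
    a[x[2]]  # without int(), the list refuses the wrapper
    direct_ok = True
except TypeError:
    direct_ok = False

print(picked, direct_ok)
```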

> 
> Whereas if simple indexing returns a Python scalar and consistency
> is desired in always having arrays returned one may have to do
> something like this
> 
> >>> y = x.indexAsArray(2) # instead of y = x[2]
> 
> or perhaps
> 
> >>> y = x[ArrayAlwaysAsResultIndexObject(2)]
>                # :-) with better name, of course
> 
> One context or the other is going to be inconvenienced, but not
> prevented from doing what is needed.

Right. 

> 
> As long as Python scalars are the 'biggest' type of their kind, we
> strongly lean towards single elements being converted into Python
> scalars. It's our feeling that there are more surprises and gotchas,
> particularly for more casual users, on this side than on the
> uncertainty of an index returning an array or scalar. People writing
> code that expects to deal with uncertain dimensionality (the only
> place that this occurs) should be the ones to go the extra distance
> in more awkward syntax.

Well, I guess I'd like to figure out exactly what breaks before ruling
it out, because consistently returning the same type from
functions/indexing is beneficial.  It becomes even more beneficial with
the exception behavior used by SciPy and numarray.

The two breakage cases I'm aware of are (1) indexing and (2) functions
that explicitly check for arguments of IntType, FloatType, or
ComplexType.  When searching the standard library for these names, they
only turn up in copy, pickle, xmlrpclib, and the types module -- all in
innocuous ways.  Searching for 'float' (which is equal to FloatType)
doesn't turn up any code that this would break either.  A search of my
site-packages had IntType tests used quite a bit -- primarily in SciPy.
Some of these would go away with this change, and many were harmless.  I
saw a few that would need fixing (several in special.py), but the fix
was trivial.
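
To illustrate the second breakage case, here is a sketch in modern
Python. It is not taken from SciPy or special.py; `Rank0`,
`strict_double`, and `tolerant_double` are hypothetical names, and the
exact type test stands in for an IntType check.

```python
class Rank0:
    """Hypothetical stand-in for a rank-0 array (not Numeric's real type)."""

    def __init__(self, value):
        self.value = value

    def __int__(self):
        return int(self.value)


def strict_double(n):
    # Modeled on an explicit type test: anything but a real int is refused.
    if type(n) is not int:
        raise TypeError("integer required")
    return 2 * n


def tolerant_double(n):
    # One possible fix: coerce first, then work with a plain int.
    return 2 * int(n)


try:
    strict_double(Rank0(3))
    strict_ok = True
except TypeError:
    strict_ok = False

tolerant_result = tolerant_double(Rank0(3))
print(strict_ok, tolerant_result)
```

A fix of this shape is a one-line change per call site, which is the
sense in which such repairs can be trivial.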

eric






