FW: [Numpy-discussion] Bug: extremely misleading array behavior

Perry Greenfield perry at stsci.edu
Mon Jun 10 13:07:04 EDT 2002

<Eric Jones writes>:
> I further believe that all Numeric functions (sum, product, etc.) should
> return arrays all the time instead of implicitly converting
> them to Python scalars in special cases such as reductions of 1d arrays.
> I think the only reason for the silent conversion is that Python lists
> only allow integer values for use in indexing so that:
>  >>> a = [1,2,3,4]
>  >>> a[array(0)]
>  Traceback (most recent call last):
>    File "<stdin>", line 1, in ?
>  TypeError: sequence index must be integer
> Numeric arrays don't have this problem:
>  >>> a = array([1,2,3,4])
>  >>> a[array(0)]
>  1
> I don't think this alone is a strong enough reason for the conversion.
> Getting rid of special cases is more important because it makes behavior
> predictable to the novice (and expert), and it is easier to write
> generic functions and be sure they will not break a year from now when
> one of the special cases occurs.  
> Are there other reasons why scalars are returned?
Well, sure. It isn't just indexing lists directly; it is
anywhere in Python that you would use a number. In some contexts
the right thing may happen (where the function knows to try to obtain
a simple number from an object), but in others it may not (for example,
when calling a function that uses the number directly to index or slice).
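
A toy illustration of that distinction, in plain Python rather than
Numeric. RankZero and accepts_as_index are hypothetical names invented
for this sketch; RankZero only models an object that wraps one number
without being a Python integer itself, and it deliberately defines no
__index__ method.

```python
import math


class RankZero:
    """Toy stand-in for a rank-0 array (hypothetical, not a Numeric type)."""

    def __init__(self, value):
        self.value = value

    def __int__(self):
        # Lets int() obtain a plain integer from the wrapper.
        return int(self.value)

    def __float__(self):
        # Lets float() and the math module obtain a plain float.
        return float(self.value)


def accepts_as_index(obj):
    """Return True if obj can be used directly as a list index."""
    try:
        [10, 20, 30, 40][obj]
    except TypeError:
        return False
    return True


r = RankZero(2)

converted = int(r)               # the callee asks for a number: works
root = math.sqrt(RankZero(9))    # math.sqrt converts through __float__: works
direct = accepts_as_index(r)     # used directly as an index: rejected
```

Functions that ask the object for a number succeed, while code that
uses the object directly as an index does not.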

Here is another case where good arguments can be made for both
sides. It really isn't an issue of functionality (one can write
methods or functions to do what is needed); it is a question of what
the convenient syntax does. For example, if we really want a Python
scalar but rank-0 arrays are always returned, then something like this
may be required:

>>> x = arange(10)
>>> a = range(10)
>>> a[scalar(x[2])] # instead of a[x[2]]

Whereas if simple indexing returns a Python scalar, and one wants the
consistency of always getting an array back, one may have to do
something like this:

>>> y = x.indexAsArray(2) # instead of y = x[2]

or perhaps

>>> y = x[ArrayAlwaysAsResultIndexObject(2)] 
               # :-) with better name, of course

One context or the other is going to be inconvenienced, but not
prevented from doing what is needed.
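
The two conventions can be put side by side in a small plain-Python
sketch. Everything here is hypothetical and invented for illustration:
RankZero, AlwaysArray, ScalarOnIndex and index_as_array are toy names,
and scalar() is only a guess at what the helper in the transcript above
might do. None of it is Numeric or numarray API.

```python
class RankZero:
    """Toy stand-in for a rank-0 array (hypothetical, not a Numeric type)."""

    def __init__(self, value):
        self.value = value


def scalar(obj):
    """Hypothetical helper: unwrap a RankZero, pass plain numbers through."""
    return obj.value if isinstance(obj, RankZero) else obj


class AlwaysArray:
    """Convention 1: indexing one element returns a rank-0 wrapper."""

    def __init__(self, data):
        self.data = list(data)

    def __getitem__(self, i):
        return RankZero(self.data[i])


class ScalarOnIndex:
    """Convention 2: indexing one element returns a Python scalar."""

    def __init__(self, data):
        self.data = list(data)

    def __getitem__(self, i):
        return self.data[i]

    def index_as_array(self, i):
        # Explicit request for the array-like result.
        return RankZero(self.data[i])


a = list(range(10))

x1 = AlwaysArray(range(10))
picked1 = a[scalar(x1[2])]    # conversion needed before indexing a list

x2 = ScalarOnIndex(range(10))
picked2 = a[x2[2]]            # plain indexing works unchanged
y = x2.index_as_array(2)      # extra syntax only when an array is wanted
```

Under either convention both results are reachable; what changes is
which caller has to write the extra call.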

As long as Python scalars are the 'biggest' type of their kind, we
strongly lean towards single elements being converted into Python
scalars. Our feeling is that always returning arrays holds more
surprises and gotchas, particularly for more casual users, than the
uncertainty over whether an index returns an array or a scalar. People
writing code that expects to deal with uncertain dimensionality (the
only place that this occurs) should be the ones to go the extra
distance and use the more awkward syntax.

