[Numpy-discussion] Notes on transitioning to numarray.

Mon Nov 10 08:28:02 EST 2003

[There is an HTML version of this note at
http://starship.python.net/~hochberg/conversion.html]

====================================
Numeric to numarray Conversion Notes
====================================

I finally bit the bullet over the last few days and moved my current
project from using Numeric to using numarray. This was a reasonably
large undertaking, involving the modification of in excess of thirty
modules. For the most part it went smoothly, requiring only the
replacement of ``import Numeric`` with ``import numarray`` [1]_.
However, in the course of the move I ran into several bugs, as well
as quite a few things that may be bugs or may be deliberate changes
from Numeric. I took some notes as the conversion progressed which I
will attempt to render here in some halfway decipherable form.

There are still some interoperability issues with Numeric and numarray,
not all of which I took the time to track down. In many cases, I solved
problems that cropped up in the conversion simply by converting some more
of the code and thus reducing the amount of mixed operations. The ones that
I did track down are reported below.

I'm using numarray 0.7 with Python 2.3 on Windows XP.

.. [1] Actually, to be strictly accurate, I replaced
   ``import Numeric as np`` with ``import numarray as na`` and then
   replaced ``np`` with ``na``.

Bugs
====

The following few things are almost certainly bugs. I haven't had
time to dig into them in any depth, but I have tried to reduce them
each to a small failing case:

    1. Copying a slice of an array onto a different slice of the same
    array fails.

    >>> y = na.arange(4)
    >>> y[1:] = y[:-1]
    >>> y # Should be array([0, 0, 1, 2])
    array([0, 0, 0, 0])
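
One plausible explanation for the result in item 1 is that the copy runs element by element, lowest index first, so that source elements are overwritten before they are read. This is an assumption, not a diagnosis of numarray's internals. The pure-Python sketch below uses hypothetical helper names and reproduces both the observed and the expected output on a plain list:

```python
def copy_forward(buf, dst_start, src_start, count):
    """Copy within one buffer, lowest index first (unsafe when regions overlap)."""
    for k in range(count):
        buf[dst_start + k] = buf[src_start + k]
    return buf


def copy_via_temp(buf, dst_start, src_start, count):
    """Snapshot the source region first, so overlap cannot corrupt it."""
    tmp = buf[src_start:src_start + count]  # slicing a list makes a copy
    for k in range(count):
        buf[dst_start + k] = tmp[k]
    return buf


# Equivalent of y[1:] = y[:-1] for y = [0, 1, 2, 3]
naive = copy_forward([0, 1, 2, 3], 1, 0, 3)
safe = copy_via_temp([0, 1, 2, 3], 1, 0, 3)
print(naive)  # [0, 0, 0, 0]
print(safe)   # [0, 0, 1, 2]
```

If this is indeed the cause, copying the source explicitly before assigning would sidestep the problem until the bug is fixed.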

    2. ``sqrt``, ``power``, and ``**`` all fail on complex zero (0j).

    >>> y = na.arange(4) + 0j
    >>> na.sqrt(y)
    Warning: Encountered invalid numeric result(s)  in sqrt
    Warning: Encountered divide by zero(s)  in sqrt
    Warning: Encountered invalid numeric result(s)  in not_equal
    Warning: Encountered invalid numeric result(s)  in not_equal
    array([-1.#IND    -1.#INDj,  1.        +0.j    ,  1.41421356+0.j    ,
            1.73205081+0.j    ])

    And similarly for ``power`` and ``**``. Note that in addition to
    the warnings, the value returned for ``sqrt(0j)`` is incorrect; it
    should be ``0j``.

    3. Mixing arrays and lists in the constructor of array can cause
    it to fail:

    >>> a = na.array([na.array([])]*3)
    Traceback (most recent call last):
      File "<stdin>", line 1, in ?
      File "C:\Python23\lib\site-packages\numarray\numarraycore.py", line 288, in array
        return fromlist(sequence, type, shape)
      File "C:\Python23\lib\site-packages\numarray\numarraycore.py", line 175, in fromlist
        arr = _gen.concatenate(l)
      File "C:\Python23\Lib\site-packages\numarray\generic.py", line 1008, in concatenate
        return _concat(arrs)
      File "C:\Python23\Lib\site-packages\numarray\generic.py", line 998, in _concat
        dest[ix:ix+a._shape[0]]._copyFrom(a)
    libnumarray.error: copy4bytes: access beyond buffer. offset=3 buffersize=0

    ``shape`` fails in the same way. Probably other functions as well.

    4. ``linear_algebra.determinant`` returns a length-1 vector when it
    should return a scalar. (Actually, I believe it sometimes returns a
    scalar and sometimes a length-1 vector, but I can't find a test case
    to reproduce that.)

    >>> a = na.reshape(na.arange(4), (2,2))
    >>> la.determinant(a)
    array([-2.])

    5. Assigning a Numeric array to a numarray slice fails:

    >>> a = na.arange(3)
    >>> b = Numeric.arange(3)
    >>> a[:] = b
    Traceback (most recent call last):
      File "<stdin>", line 1, in ?
      File "C:\Python23\Lib\site-packages\numarray\generic.py", line 505, in _slicedIndexing
        retarr._copyFrom(value)
    TypeError: argument is not array or number

Probable Bugs
=============

Now on to things that are probably bugs, but it's possible that they
represent deliberate changes from Numeric's behavior.

    6. ``numarray.dot`` doesn't accept scalars; ``Numeric.dot`` does.

    >>> na.dot(1,1)
    Traceback (most recent call last):
      File "<stdin>", line 1, in ?
      File "C:\Python23\lib\site-packages\numarray\numarraycore.py", line 939, in dot
        return ufunc.innerproduct(a, _gen.swapaxes(inputarray(b), -1, -2))
      File "C:\Python23\lib\site-packages\numarray\ufunc.py", line 1892, in innerproduct
        if a._shape[-1] != b._shape[-1]:
    IndexError: tuple index out of range

    7. ``na.searchsorted`` does not accept scalars for its second argument.
    It always takes and returns vectors.

    >>> na.searchsorted(na.arange(5), 1.5)
    Traceback (most recent call last):
      File "<stdin>", line 1, in ?
      File "C:\Python23\lib\site-packages\numarray\ufunc.py", line 1821, in searchsorted
        outarr = _nc.NumArray(shape=len(values), type=Long)
    ValueError: Rank-0 array has no length.
    >>> na.searchsorted(na.arange(5), [1.5])
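
The scalar-or-sequence behavior described in item 7 can be sketched with the standard-library ``bisect`` module. The helper name ``searchsorted_any`` is hypothetical, and the use of ``bisect_left`` for tie handling is an assumption; this illustrates the calling convention only and is not a drop-in replacement for either library's function:

```python
import bisect


def searchsorted_any(sorted_values, values):
    """Hypothetical helper: return an int for a scalar, a list for a sequence."""
    try:
        iter(values)
    except TypeError:
        # Scalar input: return a plain integer insertion index.
        return bisect.bisect_left(sorted_values, values)
    # Sequence input: return one insertion index per element.
    return [bisect.bisect_left(sorted_values, v) for v in values]


bins = list(range(5))  # stands in for arange(5)
scalar_result = searchsorted_any(bins, 1.5)
vector_result = searchsorted_any(bins, [1.5])
print(scalar_result)  # 2
print(vector_result)  # [2]
```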

    8. ``add.reduce`` takes ``dim`` as a keyword argument instead of ``axis``.
    It is documented_ to take ``axis``. I imagine this applies to other
    ``op.reduce`` methods as well.

    .. _documented: http://stsdas.stsci.edu/numarray/Doc/node30.html#SECTION035120000000000000000

    9. ``where`` and probably other functions do not appear to use
    ``asarray`` on all of their arguments. As a result, non-numarray
    sequences are not directly usable in these functions as they are in
    their Numeric equivalents. In particular, lists, tuples and Numeric
    arrays do not work:

    >>> na.where([0,1,1,0], na.arange(4), [-99,-99,-99,-99])
    Traceback (most recent call last):
      File "<stdin>", line 1, in ?
      File "C:\Python23\Lib\site-packages\numarray\generic.py", line 970, in where
        return choose(ufunc.not_equal(condition, 0), (y,x), out)
      File "C:\Python23\lib\site-packages\numarray\ufunc.py", line 1446, in __call__
        computation_mode, woutarr, cfunc, ufargs = \
      File "C:\Python23\lib\site-packages\numarray\ufunc.py", line 1462, in _setup
        convType = _maxPopType(in2)
      File "C:\Python23\lib\site-packages\numarray\ufunc.py", line 1401, in _maxPopType
        raise TypeError( "Type of '%s' not supported" % repr(x) )
    TypeError: Type of '[-99, -99, -99, -99]' not supported

    10. ``take`` now requires a keyword argument for ``axis``. Attempting
    to specify the axis with a non-keyword argument results in strange
    behavior. The docs don't appear to describe this behavior:

    >>> a = na.reshape(na.arange(9), (3,3))
    >>> a
    array([[0, 1, 2],
           [3, 4, 5],
           [6, 7, 8]])
    >>> na.take(a, [0,1], 1)
    array([1, 4])
    >>> na.take(a, [0,1], axis=1)
    array([[0, 1],
           [3, 4],
           [6, 7]])

    11. ``argmax`` returns shape ``()`` arrays instead of scalars when used
    on 1D arrays. These cannot be used to index lists:

    >>> a = na.arange(9)
    >>> i = na.argmax(a)
    >>> a[i]
    array(8)
    >>> range(9)[i]
    Traceback (most recent call last):
      File "<stdin>", line 1, in ?
    TypeError: list indices must be integers
    >>> i
    array(8)

Non Bugs
========

These are some things that would probably not be considered bugs, but
that I'd like to mention because they either tripped me up or made the
conversion, and thus my life over the last few days, more difficult than
it needed to be. Let the whining begin.

    12. ``anArray.conjugate()`` acts in place.
    ``aComplexNumber.conjugate()`` returns a new number. This seems like
    a very bad state of affairs. ``anArray.conjugate()`` should be
    renamed.

    >>> zarray = na.arange(4) * 1j
    >>> zarray
    array([ 0.+0.j,  0.+1.j,  0.+2.j,  0.+3.j])
    >>> zarray.conjugate()
    >>> zarray
    array([ 0.+0.j,  0.-1.j,  0.-2.j,  0.-3.j])
    >>> z = 1+1j
    >>> z.conjugate()
    (1-1j)
    >>> z
    (1+1j)    

    13. ``Error.popMode`` should raise an error if the last mode is popped
    off the stack. Currently the error gets raised the next time a numeric
    operation is used, which may be far away from the inadvertent pop.

    >>> na.Error.popMode() # Error should be here
    _NumErrorMode(overflow='warn', underflow='warn', dividebyzero='warn', invalid='warn')
    >>> zarray /= 0 # Not here.
    Traceback (most recent call last):
      File "<stdin>", line 1, in ?
      File "C:\Python23\lib\site-packages\numarray\numarraycore.py", line 704, in __idiv__
        ufunc.divide(self, operand, self)
      File "C:\Python23\lib\site-packages\numarray\ufunc.py", line 120, in handleError
        modes = Error.getMode()
      File "C:\Python23\lib\site-packages\numarray\ufunc.py", line 99, in getMode
        return l[-1]
    IndexError: list index out of range
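
The fail-fast behavior requested in item 13 might look something like the sketch below. ``ModeStack`` and its methods are hypothetical names invented for illustration; this is not numarray's ``Error`` class, only a minimal stack that refuses to give up its last entry so the mistake surfaces at the offending pop:

```python
class ModeStack:
    """Hypothetical error-mode stack that always keeps one mode in place."""

    def __init__(self, default):
        self._modes = [default]

    def push(self, mode):
        self._modes.append(mode)

    def pop(self):
        # Fail here, at the inadvertent pop, rather than at some later
        # numeric operation that looks up the current mode.
        if len(self._modes) == 1:
            raise IndexError("cannot pop the last error mode")
        return self._modes.pop()

    def current(self):
        return self._modes[-1]


stack = ModeStack("warn")
stack.push("ignore")
popped = stack.pop()  # fine: returns "ignore", leaving the default in place
try:
    stack.pop()       # would empty the stack, so it raises immediately
    failed_early = False
except IndexError:
    failed_early = True
print(popped, failed_early, stack.current())
```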

    14. I'm of the opinion that the underflow behavior should default to
    'ignore', not 'warn'. Nine times out of ten that's what one wants, and
    a default that you have to override most of the time is not a good
    default. It's possible that this opinion may be based on floating
    point naiveté, but it's mine and I'm sticking to it for the time being.

    15. Now we're getting to very minor things: the behavior of ``argmax``
    has changed, so that in the case of ties, you will get different
    results than with Numeric. Perhaps ``>`` became ``>=``?

    >>> a = na.array([0,1,0,1])
    >>> na.argmax(a)
    array(3)
    >>> np.argmax(a)
    1
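
The guess in item 15 is easy to check on paper: a linear scan that replaces the running best only on ``>`` keeps the first maximum, while one that replaces it on ``>=`` keeps the last. The sketch below is plain Python with hypothetical function names, not either library's implementation. It reproduces the two results shown above for ``[0, 1, 0, 1]``:

```python
def argmax_first(values):
    """Strict comparison: ties keep the earliest index."""
    best = 0
    for i in range(1, len(values)):
        if values[i] > values[best]:
            best = i
    return best


def argmax_last(values):
    """Non-strict comparison: ties move to the latest index."""
    best = 0
    for i in range(1, len(values)):
        if values[i] >= values[best]:
            best = i
    return best


data = [0, 1, 0, 1]
print(argmax_first(data))  # 1
print(argmax_last(data))   # 3
```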

    16. ``array_repr`` no longer supports the ``suppress_small`` argument.

    17. ``take`` is really only useful for array types in numarray. In
    Numeric it was sometimes useful for choosing stuff from lists of
    objects. My impression is that numarray doesn't try to support
    objects; that's probably OK since Numeric's support was pretty iffy.

    18. The fact that array comparisons return booleans in numarray broke
    some of my code because I do some comparisons and then sum the results.
    In numarray these get summed as Int8 and thus overflow. I don't
    consider this a problem; I just thought I'd mention it in case someone
    else runs into it.
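
To make the overflow in item 18 concrete, the sketch below simulates summing comparison results in a signed 8-bit accumulator, using plain Python integers. It assumes the accumulator wraps around on overflow (two's complement) rather than saturating, which is not confirmed above. The helper names are hypothetical:

```python
def wrap_int8(value):
    """Reduce an integer to the signed 8-bit range [-128, 127]."""
    return (value + 128) % 256 - 128


def sum_int8(flags):
    """Sum 0/1 flags while keeping the running total in 8 bits."""
    total = 0
    for flag in flags:
        total = wrap_int8(total + flag)
    return total


matches = [1] * 200            # 200 elements for which the comparison held
narrow = sum_int8(matches)     # accumulated in 8 bits
wide = sum(matches)            # accumulated in an ordinary Python int
print(narrow)  # -56
print(wide)    # 200
```

Once more than 127 elements compare true, the narrow total is no longer a usable count, so converting the comparison result to a wider integer type before summing avoids the problem.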

Regards,

Tim Hochberg