[There is a html version of this not at http://starship.python.net/~hochberg/conversion.html]

====================================
Numeric to numarray Conversion Notes
====================================

I finally bit the bullet over the last few days and moved my current project from using Numeric to using numarray. This was a reasonably large undertaking, involving the modification of in excess of thirty modules. For the most part it went smoothly, requiring only the replacement of ``import Numeric`` with ``import numarray`` [1]_. However, in the course of the move I ran into several bugs, as well as quite a few things that may be bugs or may be deliberate changes from Numeric. I took some notes as the conversion progressed which I will attempt to render here in some halfway decipherable form.

There are still some interoperability issues with Numeric and numarray, not all of which I took the time to track down. In many cases, I solved problems that cropped up in the conversion simply by converting some more of the code and thus reducing the amount of mixed operations. The ones that I did track down are reported below.

I'm using numarray 0.7 with Python 2.3 on Windows XP.

.. [1] Actually, to be strictly accurate, I replaced ``import Numeric as np`` with ``import numarray as na`` and then replaced ``np`` with ``na``.

Bugs
====

The following few things are almost certainly bugs. I haven't had time to dig into them in any depth, but I have tried to reduce them each to a small failing case:

1. Copying a slice of an array onto a different slice of the same array fails.

   >>> y = na.arange(4)
   >>> y[1:] = y[:-1]
   >>> y # Should be array([0, 0, 1, 2])
   array([0, 0, 0, 0])

2. ``sqrt``, ``power``, and ``**`` all fail on complex zero (0j).

   >>> y = na.arange(4) + 0j
   >>> na.sqrt(y)
   Warning: Encountered invalid numeric result(s) in sqrt
   Warning: Encountered divide by zero(s) in sqrt
   Warning: Encountered invalid numeric result(s) in not_equal
   Warning: Encountered invalid numeric result(s) in not_equal
   array([-1.#IND -1.#INDj, 1. +0.j , 1.41421356+0.j , 1.73205081+0.j ])

   And similarly for ``power`` and ``**``. Note that in addition to the warnings, the value for the sqrt(0j) is incorrect.

3. Mixing arrays and lists in the constructor of array can cause it to fail:

   >>> a = na.array([na.array([])]*3)
   Traceback (most recent call last):
     File "<stdin>", line 1, in ?
     File "C:\Python23\lib\site-packages\numarray\numarraycore.py", line 288, in array
       return fromlist(sequence, type, shape)
     File "C:\Python23\lib\site-packages\numarray\numarraycore.py", line 175, in fromlist
       arr = _gen.concatenate(l)
     File "C:\Python23\Lib\site-packages\numarray\generic.py", line 1008, in concatenate
       return _concat(arrs)
     File "C:\Python23\Lib\site-packages\numarray\generic.py", line 998, in _concat
       dest[ix:ix+a._shape[0]]._copyFrom(a)
   libnumarray.error: copy4bytes: access beyond buffer. offset=3 buffersize=0

   ``shape`` fails in the same way. Probably other functions as well.

4. ``linear_algebra.determinant`` returns a length-1 vector when it should return a scalar. (Actually, I believe it sometimes returns a scalar and sometimes a length-1 vector, but I can't find a test case to reproduce that).

   >>> a = na.reshape(na.arange(4), (2,2))
   >>> la.determinant(a)
   array([-2.])

5. Assigning a Numeric array to a numarray slice fails:

   >>> a = na.arange(3)
   >>> b = Numeric.arange(3)
   >>> a[:] = b
   Traceback (most recent call last):
     File "<stdin>", line 1, in ?
     File "C:\Python23\Lib\site-packages\numarray\generic.py", line 505, in _slicedIndexing
       retarr._copyFrom(value)
   TypeError: argument is not array or number

Probable Bugs
=============

Now on to things that are probably bugs, but it's possible that they represent deliberate changes from Numeric's behavior.

6. ``numarray.dot`` doesn't accept scalars, ``Numeric.dot`` does.

   >>> na.dot(1,1)
   Traceback (most recent call last):
     File "<stdin>", line 1, in ?
     File "C:\Python23\lib\site-packages\numarray\numarraycore.py", line 939, in dot
       return ufunc.innerproduct(a, _gen.swapaxes(inputarray(b), -1, -2))
     File "C:\Python23\lib\site-packages\numarray\ufunc.py", line 1892, in innerproduct
       if a._shape[-1] != b._shape[-1]:
   IndexError: tuple index out of range

7. ``na.searchsorted`` does not accept scalars for its second argument. It always takes and returns vectors.

   >>> na.searchsorted(na.arange(5), 1.5)
   Traceback (most recent call last):
     File "<stdin>", line 1, in ?
     File "C:\Python23\lib\site-packages\numarray\ufunc.py", line 1821, in searchsorted
       outarr = _nc.NumArray(shape=len(values), type=Long)
   ValueError: Rank-0 array has no length.
   >>> na.searchsorted(na.arange(5), [1.5])

8. ``add.reduce`` takes ``dim`` as a keyword argument instead of ``axis``. It is documented_ to take ``axis``. I imagine this applies to the ``reduce`` methods of other ufuncs as well.

   .. _documented: http://stsdas.stsci.edu/numarray/Doc/node30.html#SECTION03512000000000000000...

9. ``where`` and probably other functions do not appear to use ``asarray`` on all of their arguments. As a result, non-numarray sequences are not directly usable in these functions as they are in their Numeric equivalents. In particular, lists, tuples and Numeric arrays do not work:

   >>> na.where([0,1,1,0], na.arange(4), [-99,-99,-99,-99])
   Traceback (most recent call last):
     File "<stdin>", line 1, in ?
     File "C:\Python23\Lib\site-packages\numarray\generic.py", line 970, in where
       return choose(ufunc.not_equal(condition, 0), (y,x), out)
     File "C:\Python23\lib\site-packages\numarray\ufunc.py", line 1446, in __call__
       computation_mode, woutarr, cfunc, ufargs = \
     File "C:\Python23\lib\site-packages\numarray\ufunc.py", line 1462, in _setup
       convType = _maxPopType(in2)
     File "C:\Python23\lib\site-packages\numarray\ufunc.py", line 1401, in _maxPopType
       raise TypeError( "Type of '%s' not supported" % repr(x) )
   TypeError: Type of '[-99, -99, -99, -99]' not supported

10. ``take`` now requires a keyword argument for axis. Attempting to specify the axis with a non-keyword argument results in strange behavior. The docs don't appear to describe this behavior:

    >>> a = na.reshape(na.arange(9), (3,3))
    >>> a
    array([[0, 1, 2],
           [3, 4, 5],
           [6, 7, 8]])
    >>> na.take(a, [0,1], 1)
    array([1, 4])
    >>> na.take(a, [0,1], axis=1)
    array([[0, 1],
           [3, 4],
           [6, 7]])

11. ``argmax`` returns shape () arrays instead of scalars when used on 1D arrays. These cannot be used to index lists:

    >>> a = na.arange(9)
    >>> i = na.argmax(a)
    >>> a[i]
    array(8)
    >>> range(9)[i]
    Traceback (most recent call last):
      File "<stdin>", line 1, in ?
    TypeError: list indices must be integers
    >>> i
    array(8)

Non Bugs
========

These are some things that would probably not be considered bugs, but that I'd like to mention because they either tripped me up or made the conversion and thus my life over the last few days more difficult than it needed to be. Let the whining begin.

12. ``anArray.conjugate()`` acts in place. ``aComplexNumber.conjugate()`` returns a new number. This seems like a very bad state of affairs. ``anArray.conjugate()`` should be renamed.

    >>> zarray = na.arange(4) * 1j
    >>> zarray
    array([ 0.+0.j, 0.+1.j, 0.+2.j, 0.+3.j])
    >>> zarray.conjugate()
    >>> zarray
    array([ 0.+0.j, 0.-1.j, 0.-2.j, 0.-3.j])
    >>> z = 1+1j
    >>> z.conjugate()
    (1-1j)
    >>> z
    (1+1j)

13. ``Error.popMode`` should raise an error if the last mode is popped off the stack. Currently the error gets raised the next time a numeric operation is used, which may be far away from the inadvertent pop.

    >>> na.Error.popMode() # Error should be here
    _NumErrorMode(overflow='warn', underflow='warn', dividebyzero='warn', invalid='warn')
    >>> zarray /= 0 # Not here.
    Traceback (most recent call last):
      File "<stdin>", line 1, in ?
      File "C:\Python23\lib\site-packages\numarray\numarraycore.py", line 704, in __idiv__
        ufunc.divide(self, operand, self)
      File "C:\Python23\lib\site-packages\numarray\ufunc.py", line 120, in handleError
        modes = Error.getMode()
      File "C:\Python23\lib\site-packages\numarray\ufunc.py", line 99, in getMode
        return l[-1]
    IndexError: list index out of range

14. I'm of the opinion that the underflow behavior should default to 'ignore' not 'warn'. Nine times out of ten that's what one wants, and a default that you have to override most of the time is not a good default. It's possible that this opinion may be based on floating point naiveté, but it's mine and I'm sticking to it for the time being.

15. Now we're getting to very minor things: ``argmax``'s behavior has changed, so that in the case of ties, you will get different results than with Numeric. Perhaps ``>`` became ``>=``?

    >>> a = na.array([0,1,0,1])
    >>> na.argmax(a)
    array(3)
    >>> np.argmax(a)
    1

16. ``array_repr`` no longer supports the ``suppress_small`` argument.

17. ``take`` is really only useful for array types in numarray. In Numeric it was sometimes useful for choosing stuff from lists of objects. My impression is that numarray doesn't try to support objects; that's probably OK since Numeric's support was pretty iffy.

18. The fact that array comparisons return booleans in numarray broke some of my code because I do some comparisons and then sum the results. In numarray these get summed as Int8 and thus overflow. I don't consider this a problem, I just thought I'd mention it in case someone else runs into it.

Regards,

Tim Hochberg
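The overflow described in item 18 can be illustrated without numarray. What follows is only a minimal pure-Python sketch, assuming that an Int8 accumulator wraps around in two's complement fashion (an assumption about the behavior, not something taken from the numarray source). The helpers ``wrap_int8``, ``sum_as_int8`` and ``sum_widened`` are hypothetical names made up for the illustration.

```python
def wrap_int8(n):
    """Wrap an integer into the signed 8-bit range [-128, 127].

    Two's complement wraparound is an assumption made for this sketch.
    """
    return (n + 128) % 256 - 128


def sum_as_int8(flags):
    """Sum boolean flags in a simulated Int8 accumulator, wrapping after each add."""
    total = 0
    for flag in flags:
        total = wrap_int8(total + int(flag))
    return total


def sum_widened(flags):
    """Sum the same flags in an ordinary Python int, which does not overflow."""
    return sum(int(flag) for flag in flags)


# 200 comparisons that all came out true.
flags = [True] * 200
narrow = sum_as_int8(flags)  # wraps once the running total passes 127
wide = sum_widened(flags)    # the count that was actually wanted
print(narrow, wide)
```

The point of the sketch is only that counting true results needs an accumulator wider than the comparison result; in numarray itself the analogous workaround would presumably be to convert the comparison result to a wider integer type before summing.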
Tim Hochberg wrote:
[There is a html version of this not at http://starship.python.net/~hochberg/conversion.html]
[grrr] That should of course be 'this note'. -tim
Tim Hochberg <tim.hochberg@ieee.org>:
Tim Hochberg wrote:
[There is a html version of this not at http://starship.python.net/~hochberg/conversion.html]
[grrr] That should of course be 'this note'.
Heh. Yeah, I thought the informational content in the statement was a bit low ;)
-tim
--
Magnus Lie Hetland
http://hetland.org
"In this house we obey the laws of thermodynamics!" Homer Simpson
Nice job Tim! I'll enter the bugs individually and collectively on SourceForge. From the looks of it, it'll be a while before they're all sorted out.

Best regards,
Todd

On Mon, 2003-11-10 at 11:26, Tim Hochberg wrote:
[There is a html version of this not at http://starship.python.net/~hochberg/conversion.html]
--
Todd Miller
Space Telescope Science Institute
3700 San Martin Drive
Baltimore MD, 21030
(410) 338 - 4576
Here's one more thing I just ran into. It's related to over/underflow handling, so it's even topical. I think that the _pushmodes method of NumError (in ufunc.py) should be using _getmodestack(0), not _getmodestack(1). That is::

    def _pushmodes(self, modes):
        l = self._getmodestack(0)  # XXX changed by TAH
        l.append(modes)

What can happen now is:

- New thread is started.
- pushMode is called before any calls to getMode, resulting in a modestack one deep.
- The corresponding popMode is called, resulting in an illegal stack that is zero deep.
- There is an over/underflow and getMode fails because of the zero length stack.

While I'm on the subject, I think that the way the default stack entry is created should be changed. Currently there is no way to assign a default error mode for all threads. The simplest approach would be to add a setDefaultMode method to Error. Error._defaultmode would be set in __init__ to _NumErrorMode(), but could subsequently be re-set with setDefaultMode. _defaultmode would be used to initialize the default value in _getmodestack. Specifically::

    class NumError:
        def __init__(self, all="warn", overflow=None, underflow=None,
                     dividebyzero=None, invalid=None):
            self._defaultmode = _NumErrorMode()
            self._modestack = {}  # map of stacks indexed by thread id
            self.setMode(all=all, underflow=underflow, overflow=overflow,
                         dividebyzero=dividebyzero, invalid=invalid)

        def _getmodestack(self, empty_default=0):
            id = safethread.get_ident()
            try:
                l = self._modestack[id]
            except KeyError:
                if empty_default:
                    l = []
                else:
                    l = [self._defaultmode]
                self._modestack[id] = l
            return l

        def setDefaultMode(self, all="warn", overflow=None, underflow=None,
                           dividebyzero=None, invalid=None):
            self._defaultmode = _NumErrorMode(all, overflow, underflow,
                                              dividebyzero, invalid)

        #....

While I'm at it, what's the point of empty_default? I can't figure out when it would be useful.

-tim
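The scenario above can be played out in a small standalone sketch. This is not numarray's code: it is a toy written for current Python, with a hypothetical ``ModeStack`` class and plain strings standing in for ``_NumErrorMode`` objects. It assumes the two changes suggested in this thread, namely that a new thread's stack is always seeded with a settable default mode, and that popping the last entry raises immediately.

```python
import threading


class ModeStack:
    """Toy per-thread stack of error modes (hypothetical, for illustration only)."""

    def __init__(self, default_mode="warn"):
        self._default_mode = default_mode
        self._stacks = {}  # thread id -> list of modes

    def set_default_mode(self, mode):
        """Change the mode used to seed the stacks of threads not seen before."""
        self._default_mode = mode

    def _get_stack(self):
        # Always seed a new thread's stack with the default entry, so that a
        # matched push/pop pair can never leave the stack empty.
        ident = threading.get_ident()
        if ident not in self._stacks:
            self._stacks[ident] = [self._default_mode]
        return self._stacks[ident]

    def push_mode(self, mode):
        self._get_stack().append(mode)

    def pop_mode(self):
        stack = self._get_stack()
        if len(stack) <= 1:
            # Fail at the stray pop itself, not at the next numeric operation.
            raise IndexError("cannot pop the default error mode")
        return stack.pop()

    def get_mode(self):
        return self._get_stack()[-1]


def worker(modes, results):
    # push_mode is the first call made in this thread, as in the scenario above.
    modes.push_mode("ignore")
    results.append(modes.get_mode())
    modes.pop_mode()
    results.append(modes.get_mode())


modes = ModeStack()
modes.set_default_mode("raise")
results = []
thread = threading.Thread(target=worker, args=(modes, results))
thread.start()
thread.join()
print(results)
```

With the stack seeded on first use, the worker thread falls back to the default mode after its matched push and pop instead of being left with an empty stack.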