Logical operations, Numeric.sum() and overflow
Hi all,

Importing 'scipy' changes the output of the following code:

    import random
    import Numeric, RandomArray
    (m,n) = (1000,3)
    x = RandomArray.random((m,n))
    y = x < 0.5
    assert sum(y) == Numeric.sum(y)

from nothing to an AssertionError.

This is random behaviour: the error occurs about 90% of the time with this value of m on my PC (NumPy 23.1, SciPy 0.3.2, Python 2.4.1, Linux 2.6.11), but reducing m to 500 makes it occur only about 20% of the time.

There appear to be two causes:
(1) importing scipy changes the behaviour of "y = x < 0.5" to return an array of typecode 'b' rather than 'l'.
(2) Numeric.sum() is prone to overflow errors, returning an array of type 'b' rather than increasing precision:

    >>> a = Numeric.array([[253,254,255],[1,1,1]],'b')
    >>> Numeric.sum(a)
    array([254, 255, 0],'b')

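The wraparound in this example can be reproduced without Numeric. The following is a minimal pure-Python sketch, not Numeric's actual implementation: the helper `column_sums` and its `bits` parameter are hypothetical names introduced here, and the sketch assumes that typecode 'b' behaves like an unsigned 8-bit integer, as the output above suggests. It compares a column sum held in an 8-bit accumulator with one held in an unbounded Python integer.

```python
def column_sums(rows, bits=None):
    """Sum each column of a list of rows.

    If `bits` is given, every running total is wrapped to an unsigned
    integer of that width, imitating a fixed-size accumulator.
    (Hypothetical helper for illustration only.)
    """
    totals = [0] * len(rows[0])
    for row in rows:
        for j, value in enumerate(row):
            totals[j] += value
            if bits is not None:
                totals[j] %= 1 << bits  # wrap at 2**bits
    return totals


# The array from the message: 255 + 1 does not fit in 8 bits.
a = [[253, 254, 255], [1, 1, 1]]
wrapped = column_sums(a, bits=8)
exact = column_sums(a)

# A deterministic stand-in for "y = x < 0.5" with m = 1000 rows:
# every other row is true, so each column holds 500 true values.
y = [[i % 2] * 3 for i in range(1000)]
wrapped_y = column_sums(y, bits=8)
exact_y = column_sums(y)

print(wrapped, exact)      # [254, 255, 0] [254, 255, 256]
print(wrapped_y, exact_y)  # [244, 244, 244] [500, 500, 500]
```

Under these assumptions, any column containing more than 255 true values cannot be summed correctly in 8 bits, whereas a wider accumulator gives the exact count.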
Here's my two cents. On point (1), unit tests are needed to ensure a simple 'import scipy' can't change the behaviour of unrelated NumPy code. (How is this even possible?)

Point (2) seems to indicate a design flaw with Numeric. How do Octave and Matlab deal with this? Whatever they do, it "just works", whereas Numeric feels "broken" in this respect; this overflow propagates through other operations (Numeric.average() in my case), and finding such bugs can take hours. Any suggestions / ideas?

-- Ed

On Tue, 2005-04-19 at 18:48 +0200, Ed Schofield wrote:
Hi all,
Importing 'scipy' changes the output of the following code:

    import random
    import Numeric, RandomArray
    (m,n) = (1000,3)
    x = RandomArray.random((m,n))
    y = x < 0.5
    assert sum(y) == Numeric.sum(y)

from nothing to an AssertionError.
This is random behaviour: the error occurs about 90% of the time with this value of m on my PC (NumPy 23.1, SciPy 0.3.2, Python 2.4.1, Linux 2.6.11), but reducing m to 500 makes it occur only about 20% of the time.
There appear to be two causes:
(1) importing scipy changes the behaviour of "y = x < 0.5" to return an array of typecode 'b' rather than 'l'.
(2) Numeric.sum() is prone to overflow errors, returning an array of type 'b' rather than increasing precision:

    >>> a = Numeric.array([[253,254,255],[1,1,1]],'b')
    >>> Numeric.sum(a)
    array([254, 255, 0],'b')

Here's my two cents. On point (1), unit tests are needed to ensure a simple 'import scipy' can't change the behaviour of unrelated NumPy code. (How is this even possible?)
Point (2) seems to indicate a design flaw with Numeric. How do Octave and Matlab deal with this? Whatever they do, it "just works", whereas Numeric feels "broken" in this respect; this overflow propagates through other operations (Numeric.average() in my case), and finding such bugs can take hours. Any suggestions / ideas?
-- Ed

Numarray also uses byte-sized boolean types and has given me similar problems. I haven't felt the need to complain, but it is annoying. I suppose the byte type saves a bit of space, but frankly I think it would be better to stick to plain old integers, as C does.

chuck

participants (2)
- Charles R Harris
- Ed Schofield