Strange behaviour with for loops + numpy arrays
Hi all!

I ran some "performance" tests with numpy, comparing numpy on one CPU against MPI on 4 processors, and something looks quite strange to me. I have the following code:

    import time
    import numpy

    N = 2**10 * 4
    K = 16000

    x = numpy.random.randn(N).astype(numpy.float32)
    x *= 10**10
    print "x:", x
    t1 = time.time()

    # do something...
    for k in xrange(K):
        x *= 0.99

    print "altered x:", x

    t = time.time() - t1
    print "# loops:", K, "time needed:", t, " s "

Results:

    # loops: 1000   time needed: 0.0134310722351 s
    # loops: 2000   time needed: 0.028107881546 s
    # loops: 4000   time needed: 0.0367569923401 s
    # loops: 8000   time needed: 0.075756072998 s
    # loops: 16000  time needed: 2.11396384239 s

So for K = 16000 it didn't take twice as long, as expected; it took more than 20 times as long! After that jump it seems to "normalize":

    # loops: 32000  time needed: 8.25508499146 s
    # loops: 64000  time needed: 20.5365290642 s

First I suspected xrange was the culprit, but when I tried

    k = 0
    while k < K:
        x *= 0.99
        k += 1

it changed nothing. When I tried simply

    a = 0
    for k in xrange(K):
        a = a + 1

none of the effects above appeared, so I suspect that numpy has to be involved. My hardware is a 2.3 GHz Intel Dual Core with 2 GB RAM, running Ubuntu 10.04. I ran my tests with Python 2.6 and Sage 4.6 (which also uses 2.6). Changing the size of the arrays or switching to another computer didn't help either.

Does anyone have an idea what could be happening?

Kind regards,
maldun
On Monday 17 January 2011 17:02:43, Stefan Reiterer wrote:
You are generating denormalized numbers:

http://en.wikipedia.org/wiki/Denormal_number

Many processors cannot deal efficiently with these beasts in hardware. You may want to convert these numbers to zero if you want more speed.

--
Francesc Alted
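To make the explanation above concrete, here is a minimal sketch in plain Python 3 (standard library only, so not the original numpy code). It rounds through struct to imitate float32 arithmetic and follows a single value of magnitude 1e10 as it is repeatedly multiplied by 0.99. The helper names (to_float32, is_denormal, flush_to_zero, decay) are made up for illustration. Under these assumptions the value should leave the normal float32 range after roughly 11000 iterations, which fits the jump in run time between K = 8000 and K = 16000.

```python
import math
import struct

# Smallest positive normal float32; anything nonzero below it is denormal.
FLT_MIN_NORMAL = 2.0 ** -126


def to_float32(value):
    """Round a Python float (a double) to the nearest float32 value."""
    return struct.unpack("f", struct.pack("f", value))[0]


def is_denormal(value):
    """True if value is nonzero but smaller than the smallest normal float32."""
    return 0.0 < abs(value) < FLT_MIN_NORMAL


def flush_to_zero(value):
    """Replace a denormal by 0.0 and leave every other value untouched."""
    return 0.0 if is_denormal(value) else value


def decay(start, factor, steps, flush=False):
    """Multiply start by factor `steps` times, rounding to float32 each time."""
    x = to_float32(start)
    f = to_float32(factor)
    for _ in range(steps):
        x = to_float32(x * f)
        if flush:
            x = flush_to_zero(x)
    return x


# Rough estimate: solve 1e10 * 0.99**k < FLT_MIN_NORMAL for k.
k_estimate = math.log(1e10 / FLT_MIN_NORMAL) / -math.log(0.99)
print("denormals expected after about", int(k_estimate), "iterations")

for steps in (8000, 16000, 64000):
    x = decay(1e10, 0.99, steps)
    print(steps, x, "denormal" if is_denormal(x) else "normal or zero")

print("with flushing:", decay(1e10, 0.99, 16000, flush=True))
```

In this simulation the value never reaches exactly zero unless it is flushed: once a denormal is small enough, multiplying it by 0.99 rounds back to the same value, which would explain why the slowdown persists for K = 32000 and K = 64000. For a numpy array, one way to apply the same idea is x[numpy.abs(x) < numpy.finfo(numpy.float32).tiny] = 0 after (or every so often inside) the loop.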
Thanks, that was the problem! You never stop learning =)

-------- Original Message --------
Date: Mon, 17 Jan 2011 18:22:17 +0100
From: Francesc Alted <faltet@pytables.org>
To: Discussion of Numerical Python <numpy-discussion@scipy.org>
Subject: Re: [Numpy-discussion] Strange behaviour with for loops + numpy arrays
participants (2)

Francesc Alted

Stefan Reiterer