Monte Carlo Method and pi beliavsky at
Tue Jul 13 01:40:24 CEST 2004

Tim Hochberg <tim.hochberg at> wrote:
>beliavsky at wrote:
>> Dan Christensen <jdc at> wrote in message news:<87llhqbbh5.fsf at>...
>>>I got to thinking about this problem.  The most direct implementation
>>>something like:
>>>from random import random
>>>from math import sqrt
>>>def v1(n = 500000):
>>>    rand = random
>>>    sqr  = sqrt
>>>    sm = 0
>>>    for i in xrange(n):
>>>        sm += sqr(1-rand()**2)
>>>    return 4*sm/n
>>>This takes about 0.68s on a 2.66GHz P4 (using Python 2.4a) and
>>>about 0.29s with psyco (using Python 2.3.4).
>I can more than double the speed of this under psyco by replacing **2 
>with x*x. I have inside information that pow is extra slow on psyco(*) 
>plus, as I understand it, x*x is preferred over **2 anyway.
>from math import sqrty
>from random import random
>def v3(n = 500000):
>     sm = 0
>     for i in range(n):
>         x = random()
>         sm += sqrt(1-x*x)
>     return 4*sm/n
>(*) Psyco's support for floating point operations is considerable slower

>than its support for integer operations. The latter are optimized all 
>the way to assembly when possible, but the latter are, at best, 
>optimized into C function calls that perform the operation. Thus there 
>is always some function call overhead. Fixing this would require someone

>who groks both the guts of Psyco and the intracies of x86 floating point

>machine code to step forward. In the case of pow, I believe that there 
>is always callback into the python interpreter, so it's extra slow.
>>>One of the great things about python is that it's possible (and fun)
>>>to move loops into the C level, e.g. with this implementation:
>>>def v2(n = 500000):
>>>    a = numarray.random_array.uniform(0, 1, n)
>>>    return 4 * numarray.sum(numarray.sqrt(1 - a**2)) / n
>>>which clocks in at 0.16s with Python 2.4a.  (psyco doesn't help.)
>The same trick (replacing a**2 with a*a) also double the speed of this 
>def v4(n = 500000):
>      a = numarray.random_array.uniform(0, 1, n)
>      return 4 * numarray.sum(numarray.sqrt(1 - a*a)) / n
>> For comparison, on a 2.8GHz P4 the Fortran code to compute pi using
>> simulation runs in about 0.09s (measured using timethis.exe on Windows
>> XP), REGARDLESS of whether the calcuation is done with a loop or an
>> array. If 5e6 instead of 5e5 random numbers are used, the time taken
>> is about 0.4s, and the scaling with #iterations is about linear beyond
>> that. Here are the codes, which were compiled with the -optimize:4
>> option using Compaq Visual Fortran 6.6c.
>For yet more comparison, here are times on a 2.4 GHz P4, timed with 
>timeit (10 loops):
>v1 0.36123797857
>v2 0.289118589094
>v3 0.158556464579
>v4 0.148470238536
>If one takes into the accout the speed difference of the two CPUs this 
>puts the both the numarray and psyco solutions within about 50% of the 
>Fortran solution, which I'm impressed by. Of course, the Fortran code 
>uses x**2 instead of x*x, so it too, might be sped by some tweaks.

With Compaq Visual Fortran, the speed with x**2 and x*x are the same. Recently
on comp.lang.fortran, a Fortran expert opined that a compiler that did not
optimize the calculation of integer powers to be as fast as the hand-coded
equivalent would not be taken seriously.

Having to write out integer powers in Python is awkward, although the performance
improvement may be worth it. The code x*x*x*x is less legible than x**4,
especially if 'x' is replaced 'longname[1,2,3]'. 

Often, what you want to compute powers of is itself an expression. Is Python
with Psyco able to run code such as 

z = sin(x)*sin(x)

as fast as 

y = sin(x)
z = y*y ?

Does it make two calls to sin()? There is a performance hit for 'y = sin(x)*sin(x)'
when not using Psyco.

----== Posted via Newsfeed.Com - Unlimited-Uncensored-Secure Usenet News==---- The #1 Newsgroup Service in the World! >100,000 Newsgroups
---= 19 East/West-Coast Specialized Servers - Total Privacy via Encryption =---

More information about the Python-list mailing list