numpy performance and random numbers

Carl Johan Rehn care02 at gmail.com
Sat Dec 19 08:06:53 EST 2009


On Dec 19, 12:29 pm, Steven D'Aprano <st... at REMOVE-THIS-
cybersource.com.au> wrote:
> On Sat, 19 Dec 2009 02:05:17 -0800, Carl Johan Rehn wrote:
> > Dear friends,
>
> > I plan to port a Monte Carlo engine from Matlab to Python. However, when
> > I timed randn(N1, N2) in Python and compared it with Matlab's randn,
>
> What's randn? I don't know that function. I know the randint, random, and
> randrange functions, but not randn. Where does it come from?
>
> > Matlab came out as a clear winner with a speedup of 3-4 times. This was
> > truly disappointing. I ran this test on a Win32 machine and without the
> > Atlas library.
>
> > Why is there such a large difference in speed and how can I improve the
> > speed of randn in Python? Any help with this matter is truly appreciated
> > since I really would like to get away from Matlab and move over to
> > Python instead.
>
> Could be many reasons. Python could be generally slower than Matlab. Your
> timing code might have been faulty and you weren't comparing equal
> amounts of work (if your Python code was doing four times as much work as
> the Matlab code, then naturally it will be four times slower). Perhaps
> the Matlab random number generator is a low-quality generator which is
> fast but not very random. Python uses a very high quality RNG which is
> not cheap.
>
> But does it really matter if the RNG is slower? Your Monte Carlo engine
> is a lot more than just a RNG. What matters is whether the engine as a
> whole is faster or slower, not whether one small component is slower.
>
> --
> Steven

randn is given by

>>> import numpy
>>> numpy.random.randn(2,3)
array([[-2.66435181, -0.32486419,  0.12742156],
       [-0.2387061 , -0.55894044,  1.20750493]])

Generally, at least in my MC application, I need a large number of
random numbers. Usually I execute, for example, r = randn(100, 10000)
sequentially a relatively large number of times until sufficient
accuracy has been reached. Thus, randn is in my case a mission-critical
component for obtaining an acceptable overall run time.
Matlab and numpy have (by chance?) exactly the same name for this
functionality, so I was very happy with numpy's implementation until I
timed it. So the basic question is: how can I speed up random number
generation?
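
A timing of this usage pattern could be sketched roughly as follows. The
helper names draw_batches and time_randn are made up for illustration, and
the batch and repeat counts are arbitrary assumptions; this is not the
benchmark that produced the 3-4 times figure quoted above.

```python
import timeit

import numpy as np


def draw_batches(n_batches, shape=(100, 10000)):
    """Call randn n_batches times in sequence and return the last batch."""
    r = None
    for _ in range(n_batches):
        r = np.random.randn(*shape)
    return r


def time_randn(n_batches=5, repeat=3):
    """Return the best wall-clock time in seconds over `repeat` runs."""
    timings = timeit.repeat(lambda: draw_batches(n_batches),
                            number=1, repeat=repeat)
    return min(timings)


if __name__ == "__main__":
    print("best of 3: %.3f s for 5 batches of 100 x 10000" % time_randn())
```

Taking the minimum over several runs reduces the influence of other
processes on the machine, which matters when comparing against a timing
taken in a different environment such as Matlab.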

Carl
