speed of random number generator compared to Julia
Hello,

I have a simulation code that makes intensive use of random number generation (e.g. normal rng). I wanted to compare the performance of this simulation across some NumPy implementations and some Julia implementations. In the end, in both languages, the respective best-performing implementations are dominated by the rng, but Julia wins by a factor of 4-5, because its rng is 4-5x faster. Here are some timing results.

In IPython (Python 3.5.2 from Anaconda):

```python
import numpy as np
N = 10**6
%timeit np.random.normal(size=N)
10 loops, best of 3: 37.1 ms per loop
%timeit np.random.uniform(size=N)
100 loops, best of 3: 10.2 ms per loop
```

In Julia (0.4.7 x86_64 from Debian testing):

```julia
N = 10^6
@time randn(N);
0.007802 seconds (6 allocations: 7.630 MB)
@time rand(N);
0.002059 seconds (8 allocations: 7.630 MB)
```

(with some variation between trials)

The results are consistent in the sense that generating Gaussian numbers is 3-4 times slower than generating uniform numbers, both in Python and in Julia. But how come Julia is 4-5x faster, given that NumPy uses a C implementation for the entire process (Mersenne Twister -> uniform double -> Box-Muller transform to get a Gaussian, https://github.com/numpy/numpy/blob/master/numpy/random/mtrand/randomkit.c)? I also noticed that Julia uses a different algorithm for the Gaussian (the ziggurat method of Marsaglia and Tsang, https://github.com/JuliaLang/julia/blob/master/base/random.jl#L700), but this doesn't explain the difference for the uniform rng.

best,
Pierre
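For context on the uniform-vs-Gaussian gap above, here is a minimal textbook Box-Muller sketch in Python. It is illustrative only: NumPy's actual Gaussian generator lives in C (randomkit.c) and, to my recollection, uses a rejection-based polar (Marsaglia) variant rather than this trigonometric form. The point is simply that each normal costs a log, a square root and a trig call on top of the uniform draws, which is roughly why the Gaussian timings are a few times slower than the uniform ones.

```python
import numpy as np

def box_muller_normals(n, rng=np.random):
    """Textbook Box-Muller sketch: n standard normals from 2*n uniform draws.

    Illustrative only -- NumPy's own Gaussian generator is implemented in C
    (randomkit.c) and uses a rejection-based polar variant, not this
    trigonometric form.
    """
    u1 = 1.0 - rng.uniform(size=n)   # shift to (0, 1] so log() never sees 0
    u2 = rng.uniform(size=n)
    r = np.sqrt(-2.0 * np.log(u1))   # radius
    theta = 2.0 * np.pi * u2         # angle
    return r * np.cos(theta)         # r * np.sin(theta) would give a second batch
```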
On Mon, Apr 3, 2017 at 3:20 PM, Pierre Haessig <pierre.haessig@crans.org> wrote:
This <https://github.com/JuliaLang/julia/blob/7fb758a275a0b4cf0e3f4cbf482c065cb32f...> says that Julia uses this library <http://www.math.sci.hiroshima-u.ac.jp/~m-mat/MT/SFMT/#dSFMT>, which is different from the home-brewed version of the Mersenne Twister in NumPy. The second link claims their speed comes from generating double-precision numbers directly, rather than generating random bytes that then have to be converted to doubles, as NumPy does through this magical incantation <https://github.com/numpy/numpy/blob/master/numpy/random/mtrand/randomkit.c#L...>. They also throw the SIMD acronym around, which likely means their random number generation is vectorized. My guess is that most of the speed-up comes from that SIMD vectorization: the Mersenne Twister algorithm does a lot of work <https://github.com/numpy/numpy/blob/master/numpy/random/mtrand/randomkit.c#L...> to produce 32 random bits, so that likely dominates over a couple of arithmetic operations, even if divisions are involved.

Jaime

> Do you think Stackoverflow would be a better place for my question?

--
(\__/)
( O.o)
( > <)
This is Bunny. Copy Bunny into your signature and help him with his plans for world domination.
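To make the "bytes converted to doubles" point concrete, here is a small Python sketch of what, as I recall it, randomkit.c's `rk_double` does with two 32-bit Mersenne Twister outputs (the exact line sits behind the truncated link above, so treat the constants as something to verify against the source; `next_uint32` is a hypothetical callable standing in for the raw generator). Every uniform double costs two full 32-bit MT draws plus shifts, a multiply, an add and a division, whereas dSFMT fills the 52-bit mantissa of a double directly.

```python
import random

def rk_double_sketch(next_uint32):
    """Sketch of randomkit's uniform-double recipe (rk_double), as I recall it.

    `next_uint32` is a hypothetical callable returning one raw 32-bit
    Mersenne Twister word per call; check the constants against the
    randomkit.c source linked above.
    """
    a = next_uint32() >> 5   # keep the top 27 bits
    b = next_uint32() >> 6   # keep the top 26 bits
    # (a * 2**26 + b) is a uniform 53-bit integer; dividing by 2**53 maps it to [0, 1)
    return (a * 67108864.0 + b) / 9007199254740992.0

# Demo, using Python's own Mersenne Twister as the 32-bit word source.
print(rk_double_sketch(lambda: random.getrandbits(32)))
```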
Take a look here: https://bashtage.github.io/ng-numpy-randomstate/doc/index.html

On Mon, Apr 3, 2017 at 9:45 AM Jaime Fernández del Río <jaime.frio@gmail.com> wrote:
On 03/04/2017 at 15:52, Neal Becker wrote:
So it is indeed possible to have in Python/NumPy both the "advanced" Mersenne Twister (dSFMT) at the lower level and the ziggurat algorithm for the Gaussian transform on top. Perfect! In an ideal world this would be the default in NumPy, but I understand that it would break the reproducibility of existing code.

best,
Pierre
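As a rough benchmark sketch of the comparison Pierre describes: the `randomstate.prng.dsfmt` module path and the `method='zig'` keyword below are assumptions taken from the ng-numpy-randomstate documentation linked above, not verified API, so check the current docs before relying on them.

```python
# Hedged sketch: the `randomstate.prng.dsfmt` module path and the
# `method='zig'` keyword are assumptions based on the linked docs.
import timeit
import numpy as np

N = 10**6
reps = 20

t_numpy = timeit.timeit(lambda: np.random.normal(size=N), number=reps) / reps
print("NumPy MT19937 + polar Gaussian: %.1f ms" % (1e3 * t_numpy))

try:
    from randomstate.prng import dsfmt                 # assumed module path
    rs = dsfmt.RandomState()
    t_dsfmt = timeit.timeit(
        lambda: rs.standard_normal(N, method='zig'),   # assumed keyword
        number=reps) / reps
    print("dSFMT + ziggurat Gaussian:      %.1f ms" % (1e3 * t_dsfmt))
except (ImportError, AttributeError, TypeError):
    print("randomstate not available or API differs; skipping the dSFMT timing")
```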
On 03/04/2017 at 15:44, Jaime Fernández del Río wrote:
I'm not good enough at reading Julia to be 100% sure, but it looks like random.jl (https://github.com/JuliaLang/julia/blob/master/base/random.jl) contains a Julia implementation of the Mersenne Twister... though I have no idea whether it is the "fancy" SIMD version or the "old" 32-bit one.

best,
Pierre
participants (3)

- Jaime Fernández del Río
- Neal Becker
- Pierre Haessig