[Numpy-discussion] numpy.random and multiprocessing
Sturla Molden
sturla at molden.no
Thu Dec 11 13:04:21 EST 2008
On 12/11/2008 6:29 PM, David Cournapeau wrote:
> def task(x):
>     np.random.seed()
>     return np.random.random(x)
>
> But does this really make sense ?
Hard to say... There is a chance of this producing identical or
overlapping sequences, albeit an unlikely one. I would not do this. I'd
make one process responsible for generating the random numbers and have
it write them to a queue. That scales as long as generating the deviates
is the least costly part of the algorithm.
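To make the concern concrete: two generators that start from the same
seed return exactly the same numbers, which is what would happen if two
worker processes ended up with the same seed. A minimal sketch, assuming
nothing beyond numpy; the seed value 1234 and the names a, b, x, y are
only for illustration:

```python
import numpy as np

# Two independent generator objects that happen to start from the same seed
# (1234 is an arbitrary illustrative value).
a = np.random.RandomState(1234)
b = np.random.RandomState(1234)

x = a.random_sample(5)
y = b.random_sample(5)

# The two streams are identical, so workers seeded this way would
# silently repeat each other's draws.
same = np.array_equal(x, y)
print(same)
```

Overlapping sequences are the subtler variant of the same problem:
different seeds, but one stream runs into a stretch of the other.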
Sturla Molden
=== test.py ===
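from test_helper import task, generator
from multiprocessing import Manager, Pool, Process

if __name__ == '__main__':
    # A plain multiprocessing.Queue cannot be passed as an argument to
    # Pool tasks, so use a Manager-backed queue proxy instead.
    m = Manager()
    q = m.Queue(maxsize=32)  # or whatever
    # One dedicated producer process fills the queue with random arrays.
    g = Process(target=generator, args=(4, q))  # preferably a number much larger than 4!!!
    g.start()
    p = Pool(4)
    jobs = list()
    for i in range(4):
        jobs.append(p.apply_async(task, (q,)))
    print([j.get() for j in jobs])
    p.close()
    p.join()
    g.terminate()  # the producer loops forever, so stop it explicitly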
=== test_helper.py ===
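import numpy as np

def generator(x, q):
    # Producer: keep pushing arrays of x random deviates onto the queue.
    while True:
        item = np.random.random(x)
        q.put(item)

def task(q):
    # Consumer: take the next array of deviates off the queue.
    return q.get()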
> Is the goal to parallelize a big sampler into N tasks of M trials, to
> produce the same result as a sequential set of M*N trials? Then it does
> not sound like a trivial task at all. I know there exist libraries
> explicitly designed for parallel random number generation - maybe this
> is where we should look, instead of using heuristics which are likely to
> be bogus, and generate wrong results.
>
> cheers,
>
> David
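
For what it is worth, NumPy releases much newer than this thread (1.17
and later) ship a facility aimed at this problem:
numpy.random.SeedSequence, whose spawn() method derives child seeds
intended to give independent streams. A minimal sketch, assuming such a
NumPy version; the root seed 12345, the worker count and the helper name
draw are made up for illustration, and in real use each child seed would
be handed to one worker process:

```python
import numpy as np

N_WORKERS = 4  # illustrative worker count

# One root seed for the whole run; spawn() derives one child per worker.
root = np.random.SeedSequence(12345)
children = root.spawn(N_WORKERS)

def draw(seed_seq, n):
    """Hypothetical worker body: build a generator from its own child seed."""
    rng = np.random.default_rng(seed_seq)
    return rng.random(n)

# Here the "workers" run in a loop; each child seed could instead be sent
# to a separate process.
streams = [draw(child, 5) for child in children]
for s in streams:
    print(s)
```

The whole run stays reproducible from the single root seed, and no
worker has to call np.random.seed() on its own.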