[Numpy-discussion] numpy.random and multiprocessing
David Cournapeau
david at ar.media.kyoto-u.ac.jp
Thu Dec 11 12:04:30 EST 2008
Michael Gilbert wrote:
>> Exactly, change task_helper.py to
>>
>> ----
>> import numpy as np
>>
>> def task(x):
>>     import os
>>     print "Hi, I'm", os.getpid()
>>     return np.random.random(x)
>> ----
>>
>> and note the output
>>
>> ----
>> Hi, I'm 16197
>> Hi, I'm 16198
>> Hi, I'm 16199
>> Hi, I'm 16199
>> [ 0.58175647 0.16293922 0.30488182 0.67367263]
>> [ 0.58175647 0.16293922 0.30488182 0.67367263]
>> [ 0.58175647 0.16293922 0.30488182 0.67367263]
>> [ 0.59574921 0.61554857 0.06155764 0.75352295]
>>
>
> Shouldn't numpy (and/or multiprocessing) be smart enough to prevent
> this kind of error? A simple enough solution would be to also include
> the process id as part of the seed since it appears that the problem
> only occurs when you have different processes/threads accessing the
> random number generator at the same time.
>
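But the seed is set only once in the above code, when numpy is first
imported in the parent process; the forked workers then inherit the same
generator state. So the problem has nothing to do with numpy. I don't
think using the pid as a seed is a good idea either: for each task, the
generator should be seeded from a true random source.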
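
A minimal sketch of what seeding each task from a true random source
could look like. It assumes os.urandom is an acceptable entropy source,
and it is written for current Python rather than the Python 2 of the
quoted code; the local name rng is only illustrative.

```python
import os

import numpy as np


def task(x):
    # Assumption: 4 bytes from the OS entropy pool give a seed in the
    # range [0, 2**32 - 1], which RandomState accepts.
    seed = int.from_bytes(os.urandom(4), "little")
    # A private generator per task, so nothing depends on the global
    # state that a forked worker inherits from the parent process.
    rng = np.random.RandomState(seed)
    return rng.random_sample(x)
```

Used as the function handed to a multiprocessing pool, each call should
then draw from its own freshly seeded stream, whichever worker runs it.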
David