[Numpy-discussion] numpy.random and multiprocessing

David Cournapeau david at ar.media.kyoto-u.ac.jp
Thu Dec 11 12:04:30 EST 2008


Michael Gilbert wrote:
>> Exactly, change task_helper.py to
>>
>> ----
>> import numpy as np
>>
>> def task(x):
>>     import os
>>     print("Hi, I'm %d" % os.getpid())
>>     return np.random.random(x)
>> ----
>>
>> and note the output
>>
>> ----
>> Hi, I'm 16197
>> Hi, I'm 16198
>> Hi, I'm 16199
>> Hi, I'm 16199
>> [ 0.58175647  0.16293922  0.30488182  0.67367263]
>> [ 0.58175647  0.16293922  0.30488182  0.67367263]
>> [ 0.58175647  0.16293922  0.30488182  0.67367263]
>> [ 0.59574921  0.61554857  0.06155764  0.75352295]
>> ----
>
> Shouldn't numpy (and/or multiprocessing) be smart enough to prevent
> this kind of error?  A simple enough solution would be to also include
> the process id as part of the seed since it appears that the problem
> only occurs when you have different processes/threads accessing the
> random number generator at the same time.
>   

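But the seed is set only once in the above code, when numpy is first
imported, so every forked worker inherits the same generator state. The
problem therefore has nothing to do with numpy itself. I don't think
using the pid as a seed is a good idea either: each task should seed
its generator from a true random source.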
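
As a rough illustration of the point above (a sketch, not code from
this thread): the first half copies one generator state into two
generators, which is roughly what a fork does to the workers, and the
second half seeds a private generator per task from os.urandom. The
names fresh_seed, worker_a, worker_b and the seed argument to task() are
hypothetical, and 1234 is an arbitrary seed chosen for the example.

```python
import os

import numpy as np


def fresh_seed():
    """Return a 32-bit integer seed read from the OS entropy pool."""
    return int.from_bytes(os.urandom(4), "little")


def task(x, seed=None):
    """Hypothetical variant of task(): draw x samples from a private generator."""
    if seed is None:
        seed = fresh_seed()  # new entropy for every task
    rng = np.random.RandomState(seed)
    return rng.random_sample(x)


# Simulate what a fork does: two workers start from a copy of one state.
parent = np.random.RandomState(1234)  # arbitrary illustrative seed
inherited = parent.get_state()

worker_a = np.random.RandomState()
worker_a.set_state(inherited)
worker_b = np.random.RandomState()
worker_b.set_state(inherited)

same_a = worker_a.random_sample(4)
same_b = worker_b.random_sample(4)
print(np.array_equal(same_a, same_b))  # True: identical state, identical draws

# With per-task seeding, each call builds its own freshly seeded generator.
print(task(4))
print(task(4))
```

Passing an explicit seed, e.g. task(4, seed=7), keeps a single task
reproducible when debugging, while the default draws new entropy each
time.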

David


