[Numpy-discussion] numpy.random and multiprocessing

Gael Varoquaux gael.varoquaux at normalesup.org
Thu Dec 11 11:39:14 EST 2008


On Thu, Dec 11, 2008 at 05:23:12PM +0100, Sturla Molden wrote:
> On 12/11/2008 4:57 PM, David Cournapeau wrote:

> > Why do you say the results are the same? They don't look the same to
> > me - only the first three are the same.

> He used the multiprocessing.Pool object. There is a possible race 
> condition here: one or more of the forked processes may be doing 
> nothing. They are all competing for tasks on a queue. It could be 
> avoided by using multiprocessing.Process instead.

No, Pool is what I want, because in my production code I am submitting
jobs to that pool.

> > I am not sure I am following: the objects in python are not the same
> > if you fork a process, or I don't understand what you mean by same.
> > They may be initialized the same way, though.

> When are they initialized? On import numpy or the first call to 
> numpy.random.random?

mtrand.pyx seems pretty clear about that: on import.
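
A small sketch of what "initialized on import" implies, assuming the
global generator is the Mersenne Twister exposed through
numpy.random.get_state() and numpy.random.set_state(). The variable
names are only illustrative. A snapshot of the state taken right after
import, before any draw, reproduces exactly the same numbers when
restored, which is the situation a forked child would find itself in:

```python
import numpy as np

# The global generator already carries a state right after import,
# before any random number has been requested.
state = np.random.get_state()
print(state[0], len(state[1]))  # generator name and size of its key array

first = np.random.random(3)

# Rewind the global generator to the snapshot and draw again.
np.random.set_state(state)
second = np.random.random(3)

# Both draws started from the same state, so they are identical.
print((first == second).all())
```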

> If they are initialized on the import numpy statement, they are
> initialized prior to forking and sharing state. This is because his
> statement 'from test_helper import task' actually triggers the import
> of numpy, and it occurs prior to any fork.

This is what I thought too. However, inserting a long enough sleep
statement in my spawning loop recovers entropy. I am confused.
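
For reference, a minimal sketch of the behaviour under discussion. It is
not my test script: the helper names draw and sample_from_children are
made up for illustration, and it assumes a platform where the "fork"
start method is available. Each child is forked from a parent whose
global numpy.random state has not advanced, so without reseeding every
child should return the same number. Calling numpy.random.seed() with no
argument in the child is assumed to pull fresh entropy from the OS:

```python
import multiprocessing as mp

import numpy as np


def draw(conn, reseed):
    """Run in the child: optionally reseed, then send one draw back."""
    if reseed:
        # Assumption: seed() without an argument reseeds from OS entropy.
        np.random.seed()
    conn.send(np.random.random())
    conn.close()


def sample_from_children(n_children, reseed):
    """Fork n_children processes one by one and collect one draw from each."""
    ctx = mp.get_context("fork")  # requires a platform with fork()
    results = []
    for _ in range(n_children):
        parent_conn, child_conn = ctx.Pipe()
        proc = ctx.Process(target=draw, args=(child_conn, reseed))
        proc.start()
        child_conn.close()  # the parent only reads from its own end
        results.append(parent_conn.recv())
        proc.join()
    return results


if __name__ == "__main__":
    # The parent never draws, so every child inherits the same state.
    print(sample_from_children(3, reseed=False))
    # Reseeding inside each child gives independent streams.
    print(sample_from_children(3, reseed=True))
```

With a Pool, the same idea would presumably be to pass a function that
calls numpy.random.seed() as the initializer argument, so that each
worker reseeds once when it starts.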

Gaël


