Multiprocessing.Queue deadlock

MRAB python at mrabarnett.plus.com
Wed Oct 7 12:16:46 EDT 2009


Felix wrote:
> Hello,
> 
> I keep running into a deadlock in a fairly simple parallel script
> using Multiprocessing.Queue for sending tasks and receiving results.
> From the documentation I cannot figure out what is happening, and none
> of the examples seem to cover quite what I am doing. The main code is
> 
> import multiprocessing as mp
> from Queue import Empty  # Python 3: from queue import Empty
> 
> results = mp.Queue()
> tasks = mp.JoinableQueue()
> tasks.put((0, 0))
> procs = [mp.Process(target=work, args=(tasks, results))
>          for i in range(nprocs)]
> for p in procs:
>     p.daemon = True
>     p.start()
> 
> tasks.join()
> for i in range(nprocs): tasks.put('STOP')
> for p in procs: p.join()
> res = []
> while True:
>     try:
>         res.append(results.get(False))
>     except Empty:
>         break
> 
> 
> The function 'work' both consumes tasks, adding the results to the
> output queue, and adds new tasks to the input queue based on its
> result.
> 
> def work(tasks, results):
>     for task in iter(tasks.get, 'STOP'):
>         res = calc(*task)
>         if res:
>             results.put(res)
>             tasks.put((task[0], res[1]))
>             tasks.put((res[0], task[1]))
>         tasks.task_done()
> 
> This program will hang while the main process joins the workers
> (after all results are computed, i.e. after tasks.join()). The
> workers have finished the function 'work', but have not terminated
> yet.
> 
> Calling results.cancel_join_thread() as the last line in 'work'
> prevents the deadlocks, as does terminating the workers directly.
> However, I am not sure why that would be needed, or whether it
> might make me lose results.
> 
> It seems the workers cannot finish pushing buffered results into
> the output queue when 'results.join_thread' is called while
> terminating, but why is that? I tried calling 'results.close()'
> before joining the workers in the main process, but it does not
> make a difference.
> 
> Is there something I am misunderstanding about the interface? Is
> there a much better way to do what I am trying to do above?
> 
I think it's down to the difference between multithreading and
multiprocessing.

When multithreading, the threads share the same address space, so items
can be passed between the threads directly.

However, when multiprocessing, the processes don't share the same
address space, so items need to be passed from process to process via a
pipe. Unfortunately, the pipe has a limited capacity, so if a process
doesn't read from one end then the pipe will eventually fill up and the
sender will block. Also, a process won't terminate until it has finished
writing to the pipe, and it can't be joined until it has terminated.
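
Here's a minimal sketch of that behaviour (the name 'producer' and
the item count are just for illustration): if the parent reads
first, the child can flush its pipe and exit; joining before
reading can deadlock.

import multiprocessing as mp

def producer(q):
    # Each put() hands the item to a background feeder thread,
    # which writes it into the underlying pipe. If the pipe is
    # full, the feeder blocks until the other end reads.
    for i in range(100000):
        q.put(i)
    # On exit, the process waits for the feeder to flush the pipe.

if __name__ == '__main__':
    q = mp.Queue()
    p = mp.Process(target=producer, args=(q,))
    p.start()
    # Joining here, before reading, can deadlock: the child may
    # still be blocked trying to flush items into the full pipe.
    # Reading first lets the child finish and exit cleanly.
    for i in range(100000):
        q.get()
    p.join()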

You can therefore get into a deadlock where:

* Process A won't read from the queue until it has joined process B.
* The join won't succeed until process B has terminated.
* Process B won't terminate until it has finished writing to the queue.
* Process B can't finish writing to the queue because it's full.
* The queue is full because process A isn't reading from it.
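
So the fix is to drain the results queue *before* joining the
workers, not after. Here's a sketch of the reordered main section,
reusing your names and the Empty import from your snippet (the
one-second timeout is an arbitrary choice, giving the feeder
threads time to flush in-flight results):

tasks.join()                    # every task has been marked done
for i in range(nprocs):
    tasks.put('STOP')

# Drain the results queue before joining, so each worker's feeder
# thread can flush its pipe and the process can terminate.
res = []
while True:
    try:
        res.append(results.get(timeout=1))
    except Empty:
        break

for p in procs:                 # now the joins can succeed
    p.join()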


