[ python-Bugs-1455676 ] Simplify using Queues with consumer threads

SourceForge.net noreply at sourceforge.net
Wed Mar 22 02:42:37 CET 2006


Bugs item #1455676, was opened at 2006-03-21 16:36
Message generated for change (Comment added) made by tim_one
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1455676&group_id=5470

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: Threads
Group: Python 2.5
Status: Open
Resolution: None
Priority: 5
Submitted By: Raymond Hettinger (rhettinger)
Assigned to: Tim Peters (tim_one)
Summary: Simplify using Queues with consumer threads

Initial Comment:
When Queues are used to communicate between producer 
and consumer threads, there is often a need to 
determine when all of the enqueued tasks have been 
completed.

With this small patch, determining when all work is 
done is as simple as adding q.task_done() to each 
consumer thread and q.join() to the main thread.

Without the patch, the next best approach is to count 
the number of puts, create a second queue filled by 
the consumer when a task is done, and for the main 
thread to call successive blocking gets on the result 
queue until all of the puts have been accounted for:

    def worker():
        while True:
            task = tasks_in.get()
            do_work(task)
            tasks_out.put(None)

    tasks_in = Queue()
    tasks_out = Queue()
    for i in range(num_worker_threads):
        Thread(target=worker).start()

    n = 0 
    for elem in source():
        n += 1
        tasks_in.put(elem) 

    # block until tasks are done 
    for i in range(n): 
        tasks_out.get()

That approach is not complicated, but it does entail
more lines of code and tracking some auxiliary data.
It becomes cumbersome and error-prone when an app
has multiple occurrences of q.put() and q.get().

The patch essentially encapsulates this approach into 
two methods, making it effortless to use and easy to 
graft on to existing uses of Queue. So, the above 
code simplifies to:

    def worker():
        while True:
            task = q.get()
            do_work(task)
            q.task_done()

    q = Queue()
    for i in range(num_worker_threads):
        Thread(target=worker).start()

    for elem in source():
        q.put(elem) 

    # block until tasks are done 
    q.join() 

The put counting is automatic, there is no need for a
separate queue object, and the code expresses its
intent clearly.  It is also easy to inspect for
accuracy: each get() is followed by a task_done().
The ease of inspection remains even when there are
multiple gets and puts scattered through the code (a
situation which would become complicated for the
two-Queue approach).
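
A self-contained sketch of the pattern above, for anyone who
wants to try it.  It assumes the current Python 3 spelling
(the queue module, daemon threads).  The names run_all,
results and the squaring do_work are hypothetical stand-ins
for illustration, not part of the patch:

```python
import threading
from queue import Queue

def run_all(items, num_worker_threads=4):
    """Process every item on worker threads; return once all are done."""
    q = Queue()
    results = []
    lock = threading.Lock()

    def do_work(task):
        # Hypothetical stand-in for real work: record the square.
        with lock:
            results.append(task * task)

    def worker():
        while True:
            task = q.get()
            try:
                do_work(task)
            finally:
                # Exactly one task_done() per get(), even if do_work raises.
                q.task_done()

    # Daemon threads so the endless worker loops do not block exit.
    for _ in range(num_worker_threads):
        threading.Thread(target=worker, daemon=True).start()

    for elem in items:
        q.put(elem)

    # Block until every put() has been matched by a task_done().
    q.join()
    return sorted(results)
```

Putting task_done() in a finally clause is one design choice
among several; it keeps join() from hanging if a task fails.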

If accepted, will add docs with an example.

Besides being a fast, lean, elegant solution, the 
other reason to accept the patch is that the 
underlying problem appears again and again, requiring 
some measure of invention to solve it each time.
There are a number of approaches, but none as simple,
as fast, or as broadly applicable as having the queue
itself track items loaded and items completed.
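
To make the idea of the queue tracking items loaded and items
completed concrete, here is a minimal sketch.  It is not the
patch itself: CountingQueue and its internals are hypothetical,
and it assumes an unbounded FIFO guarded by one condition
variable.

```python
import threading
from collections import deque

class CountingQueue:
    """Illustrative only: an unbounded FIFO that counts unfinished tasks."""

    def __init__(self):
        self._items = deque()
        self._cond = threading.Condition()
        self._unfinished = 0  # puts not yet matched by task_done()

    def put(self, item):
        with self._cond:
            self._items.append(item)
            self._unfinished += 1
            self._cond.notify_all()  # wake any blocked get()

    def get(self):
        with self._cond:
            while not self._items:
                self._cond.wait()
            return self._items.popleft()

    def task_done(self):
        with self._cond:
            if self._unfinished <= 0:
                raise ValueError("task_done() called too many times")
            self._unfinished -= 1
            if self._unfinished == 0:
                self._cond.notify_all()  # wake any blocked join()

    def join(self):
        with self._cond:
            while self._unfinished:
                self._cond.wait()
```

Every waiter rechecks its own condition after waking, so
sharing a single condition variable between get() and join()
is safe here, if not the most efficient arrangement.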

----------------------------------------------------------------------

>Comment By: Tim Peters (tim_one)
Date: 2006-03-21 20:42

Message:
Logged In: YES 
user_id=31435

Yup, I'll try to make time tomorrow (can't today). 
_Offhand_ it sounds like a nice addition to me.

----------------------------------------------------------------------

Comment By: Raymond Hettinger (rhettinger)
Date: 2006-03-21 17:27

Message:
Logged In: YES 
user_id=80475

Tim, do you have a chance to look at this?

----------------------------------------------------------------------


