using queue

Tim Arnold tim.arnold at sas.com
Wed Sep 2 13:00:51 EDT 2009


"MRAB" <python at mrabarnett.plus.com> wrote in message 
news:mailman.835.1251886213.2854.python-list at python.org...
> Tim Arnold wrote:
>> Hi, I've been using the threading module with each thread as a key in a 
>> dictionary. I've been reading about Queues though and it looks like 
>> that's what I should be using instead. Just checking here to see if I'm 
>> on the right path.
>> The code I have currently compiles a bunch of chapters in a book (no more 
>> than 80 jobs at a time) and just waits for them all to finish:
>>
>>         max_running = 80
>>         threads = dict()
>>         current = 1
>>         chaps = [x.config['name'] for x in self.document.chapter_objects]
>>         while current <= len(chaps):
>>             running = len([x for x in threads.keys() if threads[x].isAlive()])
>>             if running == max_running:
>>                 time.sleep(10)
>>             else:
>>                 chap = chaps[current - 1]
>>                 c = self.compiler(self.document.config['name'], chap)
>>                 threads[chap] = threading.Thread(target=c.compile)
>>                 threads[chap].start()
>>                 current += 1
>>
>>         for thread in threads.keys():
>>             threads[thread].join(3600.0)
>> ---------------------------------
>> but I think Queue could do a lot of the above work for me. Here is 
>> pseudocode for what I'm thinking:
>>
>> q = Queue(maxsize=80)
>> for chap in [x.config['name'] for x in self.document.chapter_objects]:
>>     c = self.compiler(self.document.config['name'], chap)
>>     t = threading.Thread(target=c.compile)
>>     t.start()
>>     q.put(t)
>> q.join()
>>
>> is that the right idea?
>>
> You don't need that many threads; just create a few to do the work and let
> each do multiple chapters, something like this:
>
> class CompilerTask(object):
>     def __init__(self, chapter_queue):
>         self.chapter_queue = chapter_queue
>     def __call__(self):
>         while True:
>             chapter = self.chapter_queue.get()
>             if chapter is None:
>                 # A None indicates that there are no more chapters.
>                 break
>             chapter.compile()
>         # Put back the None so that the next thread will also see it.
>         self.chapter_queue.put(None)
>
> MAX_RUNNING = 10
>
> # Put the chapters into a queue, ending with a None.
> chapter_queue = Queue()
> for c in self.document.chapter_objects:
>     chapter_queue.put(self.compiler(self.document.config['name'],
>                                     c.config['name']))
> chapter_queue.put(None)
>
> # Start the threads to do the work.
> thread_list = []
> for i in range(MAX_RUNNING):
>     t = threading.Thread(target=CompilerTask(chapter_queue))
>     t.start()
>     thread_list.append(t)
>
> # The threads will finish when they see the None in the queue.
> for t in thread_list:
>     t.join()
>

Hi, thanks for that code. It took me a bit to understand what's going on, 
but I think I see it now.
Still, I have two questions about it:
(1) What's wrong with having each chapter in a separate thread? Is that too 
much going on for a single processor? I guess that probably doesn't matter at 
all, but some chapters run in minutes and some in seconds.
(2) The None at the end of the queue... I thought t.join() would just work. 
Why do we need None?

thanks for thinking about this,
--Tim Arnold
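
On question (2), here is a minimal sketch of how the None sentinel and 
t.join() interact, as far as I understand it. t.join() only returns once the 
thread's function returns, and a worker that loops on a blocking get() has no 
other way to learn that the work is finished. The names worker, run_all and 
make_job are made up for illustration, and job() stands in for 
chapter.compile(); this is written for Python 3, where the module is named 
queue rather than Queue.

```python
import queue
import threading


def worker(jobs, results, lock):
    # get() blocks while the queue is empty, so without a sentinel this
    # loop would never end and join() on the thread would never return.
    while True:
        job = jobs.get()
        if job is None:
            # Put the sentinel back so the next worker also sees it.
            jobs.put(None)
            break
        value = job()  # stand-in for chapter.compile()
        with lock:
            results.append(value)


def run_all(job_list, num_workers=3):
    jobs = queue.Queue()
    for job in job_list:
        jobs.put(job)
    jobs.put(None)  # sentinel: no more work after this point

    results = []
    lock = threading.Lock()
    threads = [
        threading.Thread(target=worker, args=(jobs, results, lock))
        for _ in range(num_workers)
    ]
    for t in threads:
        t.start()
    # Each join() returns once that worker has seen the sentinel.
    for t in threads:
        t.join()
    return results


def make_job(n):
    # Hypothetical job: returns a callable that squares n.
    return lambda: n * n


squares = run_all([make_job(n) for n in range(5)])
print(sorted(squares))
```

Because a free worker takes the next job as soon as it finishes its current 
one, the mix of long and short chapters from question (1) spreads itself 
across the workers without one thread per chapter.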




