<html><body style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space; ">Oops. I accidentally replied off-list:<div><br></div><div><br>On Dec 10, 2010, at 5:36 AM, Thomas Nagy wrote:<br><br><blockquote type="cite">--- On Thu, 9/12/10, Brian Quinlan wrote:<br></blockquote><blockquote type="cite"><blockquote type="cite">On Dec 9, 2010, at 4:26 AM, Thomas Nagy wrote:<br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite"><br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite"><blockquote type="cite">I am looking forward to replacing a piece of code (<a href="http://code.google.com/p/waf/source/browse/trunk/waflib/Runner.py#86">http://code.google.com/p/waf/source/browse/trunk/waflib/Runner.py#86</a>)<br></blockquote></blockquote></blockquote><blockquote type="cite"><blockquote type="cite">with the futures module which was announced in Python 3.2<br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite">beta. I am a bit stuck with it, so I have a few questions<br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite">about the futures:<br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite"><blockquote type="cite"><br></blockquote></blockquote></blockquote><blockquote type="cite"><blockquote type="cite"><blockquote type="cite">1. Is the futures API frozen?<br></blockquote></blockquote></blockquote><blockquote type="cite"><blockquote type="cite"><br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite">Yes.<br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite"><br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite"><blockquote type="cite">2. 
How hard would it be to return the tasks processed<br></blockquote></blockquote></blockquote><blockquote type="cite"><blockquote type="cite">in an output queue to process/consume the results while they<br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite">are returned? The code does not seem to be very open for<br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite">monkey patching.<br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite"><br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite">You can associate a callback with a submitted future. That<br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite">callback could add the future to your queue.<br></blockquote></blockquote><blockquote type="cite"><br></blockquote><blockquote type="cite">Ok, it works. I was thinking the object was cleaned up immediately after it was used.<br></blockquote><blockquote type="cite"><br></blockquote><blockquote type="cite"><blockquote type="cite"><blockquote type="cite">3. How hard would it be to add new tasks dynamically<br></blockquote></blockquote></blockquote><blockquote type="cite"><blockquote type="cite">(after a task is executed) and have the futures object never<br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite">complete?<br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite"><br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite">I'm not sure that I understand your question. You can<br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite">submit new work to an Executor at any time until it is<br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite">shut down and a work item can take as long to complete as you<br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite">want. 
If you are contemplating tasks that don't complete<br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite">then maybe you would be better off just scheduling a thread.<br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite"><br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite"><blockquote type="cite">4. Is there a performance evaluation of the futures<br></blockquote></blockquote></blockquote><blockquote type="cite"><blockquote type="cite">code (execution overhead)?<br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite"><br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite">No. Scott Dial did make some performance improvements so he<br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite">might have a handle on its overhead.<br></blockquote></blockquote><blockquote type="cite"><br></blockquote><blockquote type="cite">Ok.<br></blockquote><blockquote type="cite"><br></blockquote><blockquote type="cite">I have a long-running process that may use futures objects with different max_workers counts. I think it is not too far-fetched to create a new futures object each time. 
Yet, the execution becomes slower after each call, for example with&nbsp;<a href="http://freehackers.org/~tnagy/futures_test.py">http://freehackers.org/~tnagy/futures_test.py</a>:<br></blockquote><blockquote type="cite"><br></blockquote><blockquote type="cite">"""<br></blockquote><blockquote type="cite">import concurrent.futures<br></blockquote><blockquote type="cite">from queue import Queue<br></blockquote><blockquote type="cite">import datetime<br></blockquote><blockquote type="cite"><br></blockquote><blockquote type="cite">class counter(object):<br></blockquote><blockquote type="cite">&nbsp;&nbsp;&nbsp;def __init__(self, fut):<br></blockquote><blockquote type="cite">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;self.fut = fut<br></blockquote><blockquote type="cite"><br></blockquote><blockquote type="cite">&nbsp;&nbsp;&nbsp;def run(self):<br></blockquote><blockquote type="cite">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;def look_busy(num, obj):<br></blockquote><blockquote type="cite">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;tot = 0<br></blockquote><blockquote type="cite">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;for x in range(num):<br></blockquote><blockquote type="cite">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;tot += x<br></blockquote><blockquote type="cite">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;obj.out_q.put(tot)<br></blockquote><blockquote type="cite"><br></blockquote><blockquote type="cite">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;start = datetime.datetime.utcnow()<br></blockquote><blockquote type="cite">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;self.count = 0<br></blockquote><blockquote type="cite">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;self.out_q = Queue(0)<br></blockquote><blockquote type="cite">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;for x in range(1000):<br></blockquote><blockquote 
type="cite">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;self.count += 1<br></blockquote><blockquote type="cite">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;self.fut.submit(look_busy, self.count, self)<br></blockquote><blockquote type="cite"><br></blockquote><blockquote type="cite">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;while self.count:<br></blockquote><blockquote type="cite">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;self.count -= 1<br></blockquote><blockquote type="cite">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;self.out_q.get()<br></blockquote><blockquote type="cite"><br></blockquote><blockquote type="cite">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;delta = datetime.datetime.utcnow() - start<br></blockquote><blockquote type="cite">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;print(delta.total_seconds())<br></blockquote><blockquote type="cite"><br></blockquote><blockquote type="cite">fut = concurrent.futures.ThreadPoolExecutor(max_workers=20)<br></blockquote><blockquote type="cite">for x in range(100):<br></blockquote><blockquote type="cite">&nbsp;&nbsp;&nbsp;# comment the following line<br></blockquote><blockquote type="cite">&nbsp;&nbsp;&nbsp;fut = concurrent.futures.ThreadPoolExecutor(max_workers=20)<br></blockquote><blockquote type="cite">&nbsp;&nbsp;&nbsp;c = counter(fut)<br></blockquote><blockquote type="cite">&nbsp;&nbsp;&nbsp;c.run()<br></blockquote><blockquote type="cite">"""<br></blockquote><blockquote type="cite"><br></blockquote><blockquote type="cite">The runtime grows after each step:<br></blockquote><blockquote type="cite">0.216451<br></blockquote><blockquote type="cite">0.225186<br></blockquote><blockquote type="cite">0.223725<br></blockquote><blockquote type="cite">0.222274<br></blockquote><blockquote type="cite">0.230964<br></blockquote><blockquote type="cite">0.240531<br></blockquote><blockquote type="cite">0.24137<br></blockquote><blockquote 
type="cite">0.252393<br></blockquote><blockquote type="cite">0.249948<br></blockquote><blockquote type="cite">0.257153<br></blockquote><blockquote type="cite">...<br></blockquote><blockquote type="cite"><br></blockquote><blockquote type="cite">Is there a mistake in this piece of code?<br></blockquote><br>There is no mistake that I can see but I suspect that the circular references that you are building are causing the ThreadPoolExecutor to take a long time to be collected. Try adding:<br><br><span class="Apple-tab-span" style="white-space: pre; ">        </span>c = counter(fut)<br><span class="Apple-tab-span" style="white-space: pre; ">        </span>c.run()<br>+<span class="Apple-tab-span" style="white-space: pre; ">        </span>fut.shutdown()<br><br>Even if that fixes your problem, I still don't fully understand these numbers because I would expect the runtime to fall after a while as ThreadPoolExecutors are collected.<br><br>Cheers,<br>Brian<br><br><br><blockquote type="cite">Thanks,<br></blockquote><blockquote type="cite">Thomas<br></blockquote><blockquote type="cite"><br></blockquote><blockquote type="cite"><br></blockquote><blockquote type="cite"><br></blockquote><blockquote type="cite"><br></blockquote></div></body></html>