
I'm not sure that I'd agree with the simpler API part though :-)
I was referring to your old API. Still, we are both obviously very biased here :-p
Does ThreadPool use some sort of balancing strategy if poolsize were set to < len(URLs)?
Yes, of course! Otherwise it wouldn't really qualify as a pool.
"retrieve" seems to take multiple url arguments.
Correct. `retrieve` is simply a generator that retrieves URLs sequentially; the ThreadPool distributes the input stream so that each worker gets an iterator over its workload.
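The balancing behavior described above can be sketched with stdlib threading and a shared queue. This is a minimal illustration of the idea, not this library's actual internals: a pool smaller than the input still processes everything, with items interleaved across workers.

```python
import threading
import queue

def pooled_map(func, items, poolsize):
    """Sketch of pool-style load balancing: `poolsize` workers pull
    from one shared queue, so each worker effectively consumes an
    iterator over its share of the work. Results arrive out of order."""
    items = list(items)
    inq = queue.Queue()
    outq = queue.Queue()
    for item in items:
        inq.put(item)

    def worker():
        while True:
            try:
                item = inq.get_nowait()
            except queue.Empty:
                return  # queue drained: this worker is done
            outq.put(func(item))

    threads = [threading.Thread(target=worker) for _ in range(poolsize)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return [outq.get() for _ in range(len(items))]
```

For example, `pooled_map(len, URLs, 3)` would process any number of URLs with only three workers, each pulling a new item as soon as it finishes the last.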
If finer-grained job control is necessary, an Executor can be used. It is implemented on top of the pool and offers submit(*items), which returns job ids usable with cancel() and status(). Jobs can be submitted and cancelled concurrently.
What type is each "item" supposed to be?
Whatever your iterator-processing function is supposed to process. The URLs example can be written using an Executor as:

    e = Executor(ThreadPool, retrieve)
    e.submit(*URLs)
    e.close()
    print list(e.result)
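For comparison, the same submit-then-collect shape can be sketched with the stdlib's concurrent.futures; `fake_retrieve` here is a hypothetical stand-in for a real network fetch, so the sketch stays runnable offline:

```python
from concurrent.futures import ThreadPoolExecutor

URLs = ["http://a.example", "http://bb.example", "http://ccc.example"]

def fake_retrieve(url):
    # hypothetical stand-in for a real fetch: returns (url, length)
    return (url, len(url))

def run(urls, poolsize=2):
    # submit every item, then collect all results
    with ThreadPoolExecutor(max_workers=poolsize) as e:
        futures = [e.submit(fake_retrieve, u) for u in urls]
        return [f.result() for f in futures]
```

The point of either API is the same: hand the whole input stream to the pool up front, then consume the results as a batch.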
Can I wait on several items?
Do you mean wait for several particular input values to be completed? As of this moment, yes, but rather inefficiently. I have not considered it a useful feature, especially when taking the wholesale, list-processing view: a worker pool processes its input stream _out_of_order_. If you just want to wait for several particular items, it means you need their outputs _in_order_, so why use a worker pool in the first place? However, I'd be happy to implement something like Executor.submit(*items, wait=True). Cheers, aht
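The proposed submit(*items, wait=True) could be sketched roughly as below. This is a hypothetical illustration, not this library's API: one worker thread, with a threading.Event per job so a caller can block until those particular items complete.

```python
import threading
import queue

class WaitableExecutor:
    """Hypothetical sketch of submit(*items, wait=True): an Event per
    submitted item lets the caller block until those items are done,
    even though the pool itself works out of order."""
    def __init__(self, func):
        self.func = func
        self.inq = queue.Queue()
        self.done = {}    # item -> threading.Event
        self.result = {}  # item -> output
        threading.Thread(target=self._worker, daemon=True).start()

    def _worker(self):
        while True:
            item = self.inq.get()
            if item is None:
                return  # sentinel: shut down
            self.result[item] = self.func(item)
            self.done[item].set()

    def submit(self, *items, wait=False):
        for item in items:
            self.done[item] = threading.Event()
            self.inq.put(item)
        if wait:
            for item in items:
                self.done[item].wait()

    def close(self):
        self.inq.put(None)
```

Usage: `ex = WaitableExecutor(retrieve_one); ex.submit(url1, url2, wait=True)` returns only once both URLs have been processed, after which their outputs are in `ex.result`.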