Producer-consumer threading problem

Larry Bates larry.bates at websafe.com`
Tue Jun 10 23:47:53 EDT 2008


George Sakkis wrote:
> I'd like some feedback on a solution to a variant of the producer-
> consumer problem. My first few attempts turned out to deadlock
> occasionally; this one seems to be deadlock-free so far but I can't
> tell if it's provably correct, and if so, whether it can be
> simplified.
> 
> The generic producer-consumer situation with unlimited buffer capacity
> is illustrated at http://docs.python.org/lib/condition-objects.html.
> That approach assumes that the producer will keep producing items
> indefinitely, otherwise the consumer ends up waiting forever. The
> extension to the problem I am considering requires the consumer to be
> notified not only when there is a new produced item, but also when
> there is not going to be a new item so that it stops waiting. More
> specifically, I want a generator (or iterator class) with the
> following generic signature:
> 
> def iter_consumed(items, produce, consume):
>     '''Return an iterator over the consumed items.
> 
>     :param items: An iterable of objects to be `produce`()d and
> `consume`()d.
> 
>     :param produce: A callable `f(item)` that produces a single item;
> the return
>         value is ignored. What "produce" exactly means is application-
> specific.
> 
>     :param consume: A callable `f()` that consumes a previously
> produced item
>         and returns the consumed item. What "consume" exactly means is
>         application-specific. The only assumption is that if `produce`
> is called
>         `N` times, then the next `N` calls to `consume` will
> (eventually, though
>         not necessarily immediatelly) return, i.e they will not block
> indefinitely.
>     '''
> 
> One straightforward approach would be to serialize the problem: first
> produce all `N` items and then call consume() exactly N times.
> Although this makes the solution trivial, there are at least two
> shortcomings. First, the client may have to wait too long for the
> first item to arrive. Second, each call to produce() typically
> requires resources for the produced task, so the maximum resource
> requirement can be arbitrarily high as `N` increases. Therefore
> produce() and consume() should run concurrently, with the invariant
> that the calls to consume are no more than the calls to produce. Also,
> after `N` calls to produce and consume, neither should be left
> waiting.
> 
> I pasted my current solution at http://codepad.org/FXF2SWmg. Any
> feedback, especially if it has to do with proving or disproving its
> correctness, will be appreciated.
> 
> George

I had a little trouble understanding what exact problem it is that you are 
trying to solve but I'm pretty sure that you can do it with one of two methods:

1) Write the producer as a generator using yield method that yields a result 
every time it is called (something like os.walk does).  I guess you could yield 
None if there wasn't anything to consume to prevent blocking.

2) Usw somethink like Twisted insted that uses callbacks instead to handle 
multiple asynchronous calls to produce.  You could have callbacks that don't do 
anything if there is nothing to consume (sort of null objects I guess).

I don't know if either of these help or not.

-Larry



More information about the Python-list mailing list