Thread limits?

Sun Aug 20 15:04:24 EDT 2000

[Tim, suggests setting up a small pool of "server threads"]
> Use the Queue (std library) module to enter requests for work.  Spin
> off a small number of threads, each structured like so:
>
> while 1:
>     request = your_queue_object.get()  # sleeps until queue is non-empty
>     request.deal_with_it()   # do whatever work it asks for

[François Pinard]
> This gave me a few conceptual problems.

Luckily, those are much easier to bear than real problems <wink>.

> Suppose I'm done with the main application (that is, it
> distributed all the work it had to) and I want to send each
> serving thread (N of them) a command so it terminates.

for i in range(N):
    your_queue_object.put(object_whose_Deal_With_It_means_Quit)

> At first, one could just queue N termination requests in the queue,
> right after all the real work requests.  We rely on the fact
> each thread will never process more than one termination request :-).

Yes, that works fine.
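
To make that concrete, here's a minimal sketch of the whole scheme.  It
assumes a current Python, where the module is spelled "queue"; QUIT,
server() and run_pool() are names made up for illustration, and squaring a
number stands in for whatever deal_with_it() would really do.

```python
import queue
import threading

QUIT = object()  # hypothetical sentinel standing in for a "quit" request


def server(requests, results):
    """One server thread: handle requests until a QUIT arrives."""
    while True:
        request = requests.get()        # sleeps until the queue is non-empty
        if request is QUIT:
            break                       # consume exactly one QUIT, then stop
        results.put(request * request)  # stand-in for request.deal_with_it()


def run_pool(work, n_servers=3):
    """Feed a list of numbers to n_servers threads; return sorted results."""
    requests, results = queue.Queue(), queue.Queue()
    threads = [threading.Thread(target=server, args=(requests, results))
               for _ in range(n_servers)]
    for t in threads:
        t.start()
    for item in work:                   # real work first ...
        requests.put(item)
    for _ in range(n_servers):          # ... then one QUIT per server thread
        requests.put(QUIT)
    for t in threads:
        t.join()                        # wait until every server has stopped
    return sorted(results.get() for _ in range(len(work)))
```

Because a server breaks out of its loop as soon as it sees QUIT, no server
can swallow two of them, so N QUITs stop exactly N servers.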

> May the main application just terminate then,

If you used the threading module (*not* the "thread" module!) to create all
your threads, the main thread will automatically wait until all child
threads (that haven't been marked "daemons" -- see the docs) have terminated
before it exits.
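
One way to watch this happen is to run a tiny script in a fresh interpreter
and look at what it prints.  The sketch below assumes a current Python with
subprocess.run available; SCRIPT and run_script() are made-up names, and the
0.3 second sleep is just an arbitrary stand-in for real work.

```python
import subprocess
import sys

# Child script run in a fresh interpreter; {daemon} is filled in below.
SCRIPT = """
import threading, time

def child():
    time.sleep(0.3)
    print("child done")

t = threading.Thread(target=child)
t.daemon = {daemon}
t.start()
print("main done")
"""


def run_script(daemon):
    """Run SCRIPT in a new interpreter and return its output lines."""
    proc = subprocess.run(
        [sys.executable, "-c", SCRIPT.format(daemon=daemon)],
        capture_output=True, text=True, check=True)
    return proc.stdout.splitlines()
```

With run_script(False) the interpreter should hang around until the child
thread has printed its line; with run_script(True) the child is a daemon, so
the interpreter is free to exit while the child is still sleeping.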

> or should it first wait to see that the request queue got empty
> (meaning that all server threads terminated)?

That isn't reliable:  that the queue is empty merely means that all requests
have been removed from it, not that even a single one of them has been
*acted* upon yet.  A common and reliable technique is to save a list of the
Thread objects you've created, and then do

    for thread in my_list_of_thread_objects:
        thread.join()

Again, this requires using the threading module (the "thread" module is very
low-level, and much harder to use safely).
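
Here's a small sketch of the difference, assuming a current Python (module
"queue").  The names demo(), release and done are invented for the example;
the Event is there only to hold the worker between "removed the request" and
"finished the request" so the gap is easy to see.

```python
import queue
import threading
import time


def demo():
    q = queue.Queue()
    release = threading.Event()   # lets us hold the worker mid-request
    done = []

    def worker():
        item = q.get()            # the request leaves the queue here ...
        release.wait()            # ... but the work isn't finished yet
        done.append(item)

    t = threading.Thread(target=worker)
    t.start()
    q.put("job")

    while not q.empty():          # poll until the request has been removed
        time.sleep(0.01)
    seen_when_empty = list(done)  # queue is empty, yet nothing is done

    release.set()                 # let the worker finish
    t.join()                      # join() returns only after worker() ends
    seen_after_join = list(done)
    return seen_when_empty, seen_after_join
```

The first snapshot is taken when the queue is already empty and still shows
no finished work; only the snapshot taken after .join() is trustworthy.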

> ...
> But the real problem is that I read somewhere that the queued-ness
> (first-in first-out behaviour) of the Queue class is not guaranteed.  So,
> it may well happen that server threads receive the termination request
> early, and all terminate prematurely, leaving the queue forever non-empty.

Well, I haven't seen that article, but it or you are probably confused
<wink>.  As in Relativity Theory too, the concept of "after" is actually
very tricky in multi-threaded programs, and is darned hard to define except
from the point of view of a specific and fixed observer.  The Queue class is
"sequentially consistent" from the point of view of any single thread.  To
make that concrete, if thread T queues X and later (the same thread T!)
queues Y, X *will* be removed from the queue before Y is.  So if the only
thing enqueueing requests is the "main thread", it can be sure that all work
requests are removed from the queue before any termination requests,
provided only that it doesn't screw up itself by enqueueing another work
request after it enqueues the termination requests.
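
A quick way to convince yourself, sketched under the assumption of a current
Python (module "queue"): have one thread do all the puts, then drain the
queue and look at the order.  drain_in_order() and the ("work", i) /
("quit", None) tuples are made up for the example.

```python
import queue


def drain_in_order(n_work=5, n_servers=2):
    """Single producer: queue the work, then the quits, then drain it all."""
    q = queue.Queue()
    for i in range(n_work):
        q.put(("work", i))        # work requests go in first
    for _ in range(n_servers):
        q.put(("quit", None))     # termination requests go in last
    removed = []
    while not q.empty():          # only one thread here, so empty() is safe
        removed.append(q.get())
    return removed
```

Everything comes back out in exactly the order the one producer put it in,
so no "quit" can jump ahead of a "work" request queued earlier by that same
thread.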

From God's point of view, things aren't necessarily that simple if more than
one thread is putting things on the queue.  For example, by God's clock it
may be that thread T reaches a q.put(X) line at time G, and thread U a
q.put(Y) line at time G+1, but that Y actually gets onto the queue at time
G+2 and X not until time G+1000000.  The only way to get threads to *agree*
on "what happened first" is to build protocols on top of synchronization
gimmicks (like Events and Conditions and Locks -- btw, this is the deep
reason they're *called* "synchronization" gimmicks).  Each thread has its
own self-consistent sense of time, but unrelated to any other thread's sense
of time unless you force them to "sync up".
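
A sketch of what you can and can't count on with two producers, again
assuming a current Python (module "queue"); producer(), collect() and
per_producer() are invented names.  Each producer tags its items with its
own name and a sequence number.

```python
import queue
import threading


def producer(q, name, count):
    for seq in range(count):
        q.put((name, seq))        # each thread numbers its own puts


def collect(count=100):
    """Run producers T and U to completion, then drain the queue."""
    q = queue.Queue()
    producers = [threading.Thread(target=producer, args=(q, name, count))
                 for name in ("T", "U")]
    for p in producers:
        p.start()
    for p in producers:
        p.join()                  # all puts are finished after this
    return [q.get() for _ in range(2 * count)]


def per_producer(items, name):
    """Sequence numbers from one producer, in the order they were removed."""
    return [seq for who, seq in items if who == name]
```

per_producer(items, "T") and per_producer(items, "U") each come out in
increasing order, since each thread's own puts stay in order.  How T's items
interleave with U's is up to the scheduler and can differ from run to run.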

> This is why I'm staying away from the Queue class for now, until
> I find some elegant way to handle the termination of the server
> threads.  Any opinion or suggestion you might have is welcome,
> of course! :-)

Provided you only want your main thread to exit after the server threads
die, use the threading module and you're all set "by magic".  Else use the
".join() in a loop" idiom above (if you study the source for threading.py,
you'll see that the "main thread" does this itself in hidden class
_MainThread, via registering an exit handler to invoke
_MainThread.__exitfunc:  that just loops over the threads remaining, doing a
.join() with each).

all-obvious-to-the-most-casual-observer<wink>-ly y'rs  - tim