Persistent Threads & Synchronisation

Sun Nov 26 06:30:50 EST 2006

I appear to be having some problems using the isAlive() method to
detect whether a thread is alive/active/running or not.  I'd be grateful
for any advice.

I have a visualisation program (which uses PyGame Extended [1]) that
presents content to the user and is meant to download the next batch of
content whilst the current one is being displayed.  To achieve this, I
created a thread to perform the downloading in the background (a class
based on threading.Thread).

The way I want it to work is this:  The downloading thread, when
spawned, stays alive for the duration of the program.  Occasionally the
main program will call a function in it to download the data and save it
as files on disk.  Then, these files are loaded by the main thread.
When this has happened, the main thread calls another function in the
download thread to delete the temporary files.  The same thing happens
next time a download is needed, when the user is looking at some other
content.

My problem is this: the downloading thread only seems to execute code
in a separate thread as long as the run() function (called by the
start() function) is running.  This is as per the documentation, of
course.  I am performing the download in the run() function, but the
file cleanup is still done with a separate call.  The cleanup does work
once the download is over and run() has terminated, but I believe it no
longer happens in a separate thread (I previously made the mistake of
calling run() directly instead of start(), and it blocked the main
program).
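
To illustrate what I mean, here is a minimal sketch (not my real code).
The Worker class and its attribute names are hypothetical, and it
assumes a Python version that provides threading.current_thread().  It
records which thread executes run() and which executes a method that is
called separately:

```python
import threading


class Worker(threading.Thread):  # hypothetical stand-in for the download thread
    def __init__(self):
        threading.Thread.__init__(self)
        self.run_thread = None
        self.cleanup_thread = None

    def run(self):
        # Executed in the new thread once start() has been called.
        self.run_thread = threading.current_thread().name

    def cleanup(self):
        # Executed in whichever thread makes the call.
        self.cleanup_thread = threading.current_thread().name


worker = Worker()
worker.start()
worker.join()
worker.cleanup()  # called here from the main program's thread

print(worker.run_thread)      # name of the worker thread
print(worker.cleanup_thread)  # name of the calling thread
```

As far as I can tell, a method on a Thread object is just an ordinary
method: only the body of run() is executed in the new thread.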

I have been using join() to wait until the download is complete at the
point in the main program where it is absolutely necessary for the
download to have finished.  I'm trying to use the following code to
restart the downloading thread when I next need to use it:

    if not self.preview_thread.isAlive():
        self.preview_thread.start()

Unfortunately, isAlive() sometimes returns False when the thread is
actually still running.  This means that my code crashes out, on the
assertion in threading.Thread that self.__started is False.  The
documentation [2] explains that the issue of threads being ``alive''
has ``intentionally been left vague'', which doesn't inspire confidence
:-S.
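
One possible explanation, offered as a guess rather than a diagnosis:
the documentation says start() must be called at most once per thread
object, so a thread that has finished cannot be restarted even though
isAlive() correctly reports False.  The sketch below assumes a current
Python version, where the method is spelled is_alive() and a second
start() raises RuntimeError instead of failing an assertion.  The
download() function is a hypothetical placeholder:

```python
import threading


def download():  # hypothetical placeholder for the real download
    pass


t = threading.Thread(target=download)
t.start()
t.join()
alive_after_join = t.is_alive()  # run() has returned by now

try:
    t.start()  # second start() on the same Thread object
    restarted = True
except RuntimeError:
    restarted = False

# A fresh Thread object can be created and started for each download.
t2 = threading.Thread(target=download)
t2.start()
t2.join()
```

If that guess is right, creating a new thread object per download would
avoid the crash, at the cost of the thread not being persistent.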

To work around this problem, I was thinking of doing this:

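    import threading

    class PreviewThread(threading.Thread):
        . . .
        def run(self):
            self.download_data()
            # Busy-wait until cleanup() has been called from the main thread.
            while not self.temp_files_cleaned:
                pass
        . . .
        def cleanup(self):
            . . .
            # Setting the flag lets run() fall out of its loop and return.
            self.temp_files_cleaned = True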

This should allow the thread to remain alive until after the download
*and* cleanup have been completed, so that I can be sure it's safe to
restart it when I next need to download more data.

My ultimate question: is this the right way to do things?  I don't like
the idea of a loop such as the one in the proposed run() function
above; it seems inefficient.  I am also concerned that it might not let
me call other functions in the thread.
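
A less wasteful variant of the same idea might replace the flag and the
loop with a threading.Event, which blocks instead of spinning.  This is
only a sketch under assumptions: the downloaded and cleaned attributes
are names invented for illustration, download_data() is an empty
placeholder, and is_alive() is the current spelling of isAlive():

```python
import threading


class PreviewThread(threading.Thread):
    def __init__(self):
        threading.Thread.__init__(self)
        self.downloaded = threading.Event()  # set when the download is done
        self.cleaned = threading.Event()     # set when cleanup() has run

    def download_data(self):
        pass  # hypothetical placeholder for the real download

    def run(self):
        self.download_data()
        self.downloaded.set()
        # Sleep until cleanup() is called; no CPU is burned while waiting.
        self.cleaned.wait()

    def cleanup(self):
        # Temporary files would be deleted here (in the caller's thread).
        self.cleaned.set()


t = PreviewThread()
t.start()
t.downloaded.wait()          # main thread waits for the download
still_alive = t.is_alive()   # run() is blocked in cleaned.wait()
t.cleanup()
t.join()
```

Note that cleanup() itself would still execute in the thread that calls
it; the Event only keeps run() from returning early.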

I did wonder about using a Condition, but that seems to be more suited
to synchronising between threads, which isn't really the issue here
(and thus it seems like overkill for solving the problem of replacing
that loop with something more efficient, though it could be a
possibility, I suppose).

It seems I'm missing something big and possibly obvious about how to
make threads persistent.  I would like to know ``the right way'' to do
it and would really appreciate some advice on the subject.
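
For concreteness, the kind of persistent thread I have in mind might
look like the sketch below: run() loops for the life of the program and
takes work items off a queue, so that both the download and the cleanup
are executed in the worker thread.  Everything here is an assumption
made for illustration: the DownloadWorker class, its submit(), wait()
and stop() methods, and the use of the queue module as it is named in
current Python versions:

```python
import queue
import threading


class DownloadWorker(threading.Thread):  # hypothetical persistent worker
    def __init__(self):
        threading.Thread.__init__(self)
        self.daemon = True          # do not keep the program alive on exit
        self.tasks = queue.Queue()

    def run(self):
        while True:
            func = self.tasks.get()  # blocks until work arrives
            try:
                if func is None:     # sentinel: shut the worker down
                    break
                func()
            finally:
                self.tasks.task_done()

    def submit(self, func):
        self.tasks.put(func)

    def wait(self):
        self.tasks.join()            # block until all submitted work is done

    def stop(self):
        self.tasks.put(None)
        self.join()


log = []
worker = DownloadWorker()
worker.start()
worker.submit(lambda: log.append(("download", threading.current_thread() is worker)))
worker.wait()   # where the main program needs the download finished
worker.submit(lambda: log.append(("cleanup", threading.current_thread() is worker)))
worker.wait()
worker.stop()
```

With this arrangement the thread is started once and never restarted,
so the isAlive() check would not be needed at all.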

Many thanks in advance!

[1] http://codereactor.net/projects/pygext/
[2] http://docs.python.org/lib/thread-objects.html
-- 
Matthew Tylee Atkinson <matthew at agrip.org.uk>