[Python-ideas] Restartable Threads

Sun Mar 2 23:02:31 CET 2008

Hi everyone,

Why is it that threads can't be restarted?

I hope this is the right place for this discussion. If this has been
(or should be) discussed somewhere else, I apologize: my searches for
"restart thread" and similar only turned up statements that restarting
threads is impossible, which haven't satiated my curiosity.

Is there any fundamental reason why this can't (or shouldn't) be done?
If not, what would you think of making thread restartability an
option?

For those who are wondering why I might wish to condemn myself by
using threads at all (rather than, say, subprocesses), never mind
threads that can be restarted while maintaining state, my use case is
as follows:

I am performing low-level hardware control through a C API that I have
wrapped with ctypes. The main "run"-type C function takes a pointer to
a struct in which it stores information about its state. This function
blocks while the system is running, but also needs access to the
shared memory space, and thus has to be executed in its own thread (I
believe - other options welcomed). In order to signal events, the
function returns with a signal code. Once the signal has been dealt
with, the API specifies that the same C function call be made, passing
the pointer to the original struct, so that the system can resume
operation where it left off.

I'm sure this could be done using a standard thread (although I
haven't actually done it) with something like:

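from threading import Thread, Event

class LoopThread(Thread):
    def __init__(self):
        Thread.__init__(self)
        self.c_state_struct = structMaker()
        self.signal_evt = Event()   # set when the C call has returned
        self.resume_evt = Event()   # set once the signal has been handled
        self.ret = None

    def run(self):
        while self.ret != 0:
            self.resume_evt.clear()
            self.ret = sharedLib.blocking_call(self.c_state_struct)
            self.signal_evt.set()
            self.resume_evt.wait()

t = LoopThread()
t.start()
... do some things ...
t.signal_evt.wait()
... deal with the signal (t.ret) ...
t.signal_evt.clear()
t.resume_evt.set()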

Or some such ugliness, but it seemed to me that the most natural
implementation of such a system would be something more like:

class myThread(Thread):
    def __init__(self):
        self.c_state_struct = structMaker()
        Thread.__init__(self)
    def run(self):
        self.ret = sharedLib.blocking_call(self.c_state_struct)

which would then be executed with:

>>> t = myThread()
>>> t.start()
... do some other stuff ...
>>> t.join()
>>> signal_handle(t.ret)  # deal with the returned value
>>> t.start()             # resume operation

However, this is impossible, since a thread's start() method can only
be called once, as explained in [1], [2] and [3]. (Python 2.5 raises an
AssertionError on the second call, although as of rev 55785 this has
been changed to a RuntimeError.) What I have been unable to find
explained, however, is why this should be, or needs to be, the case.
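
For reference, the limitation is easy to reproduce. The following is a
minimal sketch only, written in Python 3 syntax (an assumption; the rest
of this message targets Python 2.5, where the exception type differs as
noted above). The helper name start_twice is hypothetical:

```python
import threading

def start_twice():
    """Start a finished thread a second time and return the error text."""
    t = threading.Thread(target=lambda: None)
    t.start()
    t.join()  # the thread has finished, yet it still cannot be restarted
    try:
        t.start()
    except RuntimeError as exc:
        return str(exc)
    return None  # only reached if the second start() was accepted

message = start_twice()
print(message)
```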

To see if I could get around this limitation, I initially hacked this together:

class myThread(Thread):
	def __init__(self):
		self.i = 1
		Thread.__init__(self)

	def start(self):
		Thread.__init__(self)
		Thread.start(self)

	def run(self):
		print self.i
		self.i += 1
		return self.i

to be used as:

>>> t = myThread()
>>> t.start()
>>> t.join()
1
>>> t.start()
>>> t.join()
2

Obviously it is not usable in the general case, since it completely
clobbers the thread's internal state through the repeated __init__()s,
but one could certainly imagine a more delicate implementation that
saves the relevant bits and pieces, while resetting those that need
it.
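
As a point of comparison, similar behaviour can probably be had without
touching threading.py at all, by keeping the state in a plain object and
creating a fresh Thread for every run. This is a different technique
from the patch described below, not a sketch of it. It is written in
Python 3 syntax, and the names Resumable and fake_blocking_call are
hypothetical (the latter stands in for sharedLib.blocking_call). It also
assumes the C API does not care which OS thread resumes it:

```python
import threading

class Resumable(object):
    """Keeps state across runs; each start() uses a fresh Thread object."""

    def __init__(self, func, state):
        self.func = func      # callable taking the state, returning a code
        self.state = state    # survives between runs, like the C struct
        self.ret = None
        self._thread = None

    def _run(self):
        self.ret = self.func(self.state)

    def start(self):
        # Refuse to start while the previous run is still executing.
        if self._thread is not None and self._thread.is_alive():
            raise RuntimeError("thread already started")
        self._thread = threading.Thread(target=self._run)
        self._thread.start()

    def join(self, timeout=None):
        if self._thread is not None:
            self._thread.join(timeout)


def fake_blocking_call(state):
    # Hypothetical stand-in for the blocking C function.
    state["calls"] += 1
    return state["calls"]


r = Resumable(fake_blocking_call, {"calls": 0})
r.start()
r.join()
first = r.ret
r.start()   # a second start() works because a new Thread is created
r.join()
second = r.ret
```

The is_alive() check is only safe when a single controlling thread calls
start(); two threads calling start() at the same time could still race.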

With that in mind, I had a look into threading.py and, not immediately
seeing any reason this couldn't be done, implemented essentially that
functionality. The attached patch is implemented against threading.py
from trunk. I've also uploaded a patched copy of my threading.py that
can be used with python2.5 to [4], if anyone needs that.

In order to maintain complete backward compatibility, I've left the
default behaviour to have threads behave as they do today, but by
initializing them with "restartable=True", start() can be called
repeatedly. For example:

class Counter(Thread):
	def run(self):
		if not hasattr(self, "count"):
			self.count = 0
		else:
			self.count += 1

could be used with:

>>> t = Counter(restartable=True)
>>> t.start()
>>> t.join()
>>> print t.count
0
>>> t.start()
>>> t.join()
>>> print t.count
1

If an attempt is made to restart the thread while it is executing, it
still raises a RuntimeError, which I think makes sense:

>>> t = LongThread(restartable=True)
>>> t.start()
>>> t.start()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "threading.py", line 441, in start
    raise RuntimeError("thread already started")
RuntimeError: thread already started

So this _seems_ to work, but I have to admit, I'm somewhat afraid to
use it. I can't help but wonder: is it safe, or is it tempting the
Gods of parallelism to inflict sudden, multi-threaded death?

Less superstitious opinions than my own would be greatly appreciated.

Thanks,

-Gabriel

Note: In addition to the patch, I have attached a few usage
examples/test cases that I should really turn into actual unit tests.
Some of these are expected to fail, so the file can't be executed
directly; the examples should be run in an interpreter.

[1]: http://docs.python.org/lib/thread-objects.html
[2]: http://mail.python.org/pipermail/python-list/2006-November/415503.html
[3]: http://lookherefirst.wordpress.com/2007/12/20/can-i-restart-a-thread-in-python/
[4]: http://ieeesb.mcmaster.ca/~grantgm/reThread/threading.py
-------------- next part --------------
A non-text attachment was scrubbed...
Name: threading-restartable-thread.patch
Type: text/x-patch
Size: 1257 bytes
Desc: not available
URL: <http://mail.python.org/pipermail/python-ideas/attachments/20080302/e44a6038/attachment.bin>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: restartable_Thread_examples.py
Type: text/x-python
Size: 1635 bytes
Desc: not available
URL: <http://mail.python.org/pipermail/python-ideas/attachments/20080302/e44a6038/attachment.py>