Hi everyone,

Why is it that threads can't be restarted?

I hope this is the right place for this discussion. If this has been (or should be) discussed somewhere else, I apologize: my searches for "restart thread" and similar only turned up statements that restarting threads is impossible, which haven't satiated my curiosity.

Is there any fundamental reason why this can't (or shouldn't) be done? If not, what would you think of making thread restartability an option?

For those who are wondering why I might wish to condemn myself by using threads at all (rather than, say, subprocesses), never mind threads that can be restarted while maintaining state, my use case is as follows:

I am performing low-level hardware control through a C API that I have wrapped with ctypes. The main "run"-type C function takes a pointer to a struct in which it stores information about its state. This function blocks while the system is running, but also needs access to the shared memory space, and thus has to be executed in its own thread (I believe - other options welcomed). In order to signal events, the function returns with a signal code. Once the signal has been dealt with, the API specifies that the same C function call be made, passing the pointer to the original struct, so that the system can resume operation where it left off.

I'm sure this could be done using a standard thread (although I haven't actually done it) with something like:

    def myloop():
        while not self.ret == 0:
            self.resume_evt.clear()
            self.ret = sharedLib.blocking_call(self.c_state_struct)
            self.signal_evt.set()
            self.resume_evt.wait()

    t.Thread(target=myloop)
    ... do some things ...
    t.signal_evt.wait()
    ... deal with the signal ...
    t.signal_evt.clear()
    t.resume_evt.set()

Or some such ugliness, but it seemed to me that the most natural implementation of such a system would be something more like:

    class myThread(Thread):
        def __init__(self):
            self.c_state_struct = structMaker()
            Thread.__init__(self)

        def run(self):
            self.ret = sharedLib.blocking_call(self.c_state_struct)

which would then be executed with:
    >>> t = myThread()
    >>> t.start()
    ... do some other stuff ...
    >>> t.join()
    >>> signal_handle(t.ret)  # deal with the returned value
    >>> t.start()  # resume operation
However, this is impossible, since a thread's start() method can only be called once (as explained in [1], [2] and [3]; python2.5 raises an assertion error, although as of rev 55785 this has been changed to a RuntimeError). What I have been unable to find explained, however, is why this should be or needs to be the case.

To see if I could get around this limitation, I initially hacked this together:

    class myThread(Thread):
        def __init__(self):
            self.i = 1
            Thread.__init__(self)

        def start(self):
            Thread.__init__(self)
            Thread.start(self)

        def run(self):
            print self.i
            self.i += 1
            return self.i

to be used as:
    >>> t = myThread()
    >>> t.start()
    >>> t.join()
    1
    >>> t.start()
    >>> t.join()
    2
Obviously it is not usable in the general case, since it completely clobbers the thread's internal state through the repeated __init__()s, but one could certainly imagine a more delicate implementation that saves the relevant bits and pieces, while resetting those that need it.

With that in mind, I had a look into threading.py and, not immediately seeing any reason this couldn't be done, implemented essentially that functionality.

The attached patch is implemented against threading.py from trunk. I've also uploaded a patched copy of my threading.py that can be used with python2.5 to [4], if anyone needs that.

In order to maintain complete backward compatibility, I've left the default behaviour to have threads behave as they do today, but by initializing them with "restartable=True", start() can be called repeatedly. For example:

    class Counter(Thread):
        def run(self):
            if not hasattr(self, "count"):
                self.count = 0
            else:
                self.count += 1

could be used with:
    >>> t = Counter(restartable=True)
    >>> t.start()
    >>> t.join()
    >>> print t.count
    0
    >>> t.start()
    >>> t.join()
    >>> print t.count
    1
If an attempt is made to restart the thread while it is executing, it still raises a RuntimeError, which I think makes sense:
    >>> t = LongThread(restartable=True)
    >>> t.start()
    >>> t.start()
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "threading.py", line 441, in start
        raise RuntimeError("thread already started")
    RuntimeError: thread already started
So this _seems_ to work, but I have to admit, I'm somewhat afraid to use it. I can't help but wonder: is it safe, or is it tempting the Gods of parallelism to inflict sudden, multi-threaded death? Less superstitious opinions than my own would be greatly appreciated.

Thanks,

-Gabriel

Note: In addition to the patch, I have attached a few usage examples/test cases, that I should really make into actual unit tests. Some of these are expected to fail, so the file can't be executed directly - the examples should be run in an interpreter.

[1]: http://docs.python.org/lib/thread-objects.html
[2]: http://mail.python.org/pipermail/python-list/2006-November/415503.html
[3]: http://lookherefirst.wordpress.com/2007/12/20/can-i-restart-a-thread-in-pyth...
[4]: http://ieeesb.mcmaster.ca/~grantgm/reThread/threading.py
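For comparison, the event-based loop outlined near the top of this message can be made runnable roughly as follows. This is only a sketch under stated assumptions: it uses Python 3 syntax rather than the Python 2 of the thread, `fake_blocking_call`, `Controller` and `run_to_completion` are hypothetical names, and a plain dictionary stands in for the C state struct that would be passed to `sharedLib.blocking_call`.

```python
import threading


def fake_blocking_call(state):
    """Hypothetical stand-in for sharedLib.blocking_call.

    Returns a non-zero signal code while work remains and 0 when finished.
    """
    state["step"] += 1
    return 0 if state["step"] >= 3 else state["step"]


class Controller:
    """Runs the blocking call in one long-lived thread, pausing between calls."""

    def __init__(self):
        self.state = {"step": 0}  # plays the role of c_state_struct
        self.ret = None
        self.signal_evt = threading.Event()
        self.resume_evt = threading.Event()
        self.thread = threading.Thread(target=self._loop)

    def _loop(self):
        while True:
            self.ret = fake_blocking_call(self.state)
            self.signal_evt.set()      # a signal code is ready for the parent
            if self.ret == 0:
                return                 # finished, let the thread end
            self.resume_evt.wait()     # park until the parent has handled it
            self.resume_evt.clear()


def run_to_completion():
    ctl = Controller()
    ctl.thread.start()
    handled = []
    while True:
        ctl.signal_evt.wait()
        ctl.signal_evt.clear()
        if ctl.ret == 0:
            break
        handled.append(ctl.ret)        # "deal with the signal"
        ctl.resume_evt.set()           # let the worker call the function again
    ctl.thread.join()
    return handled, ctl.state["step"]
```

Here the same thread and the same state object survive across every call, so nothing is restarted; the cost is the two-event handshake that the message describes as ugly.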
Sorry if I missed it from your email, but why can't you just create another thread object before each start call?

I think the only objection to restarting a thread would be that the idea is that each thread object represents a thread... but I might be completely wrong.

--
Leonardo Santagada
On Sun, Mar 2, 2008 at 5:11 PM, Leonardo Santagada <santagada@gmail.com> wrote:
> Sorry if I missed it from your email,

I know the message was rather long. Sorry about that.

> but why can't you just create another thread object before each start call?

The state of the thread needs to be preserved from start() to start() because the C function needs to be passed the same object each time it is called.

The state could be maintained by creating the persistent object in the parent thread and passing it to a new child thread before each call, but for a few reasons this feels wrong:

It seems to me that this would break encapsulation - objects exist for the purpose of carrying state. They shouldn't rely on their parent to do that for them. Doing so would muck up the parent, especially once there are a) multiple child threads and b) multiple state-carrying objects that need to be maintained within each thread.

Also, from a more conceptual point of view, the C function basically represents a single, restartable process, so it seems it should be packaged and used as such. When the function returns, it is more akin to a synchronization point between threads than stopping one thread and then creating and starting another.

Hopefully that clarifies my thinking a bit (or at least doesn't muddy the waters any further :)

> I think the only objection to restarting a thread would be that the idea is that each thread object represents a thread... but I might be completely wrong.

And that may be a valid objection, although the lifetime of the Thread object does not directly correspond with that of the thread it wraps. The thread is created upon calling start(), and dies when run() returns. The way it is implemented, the Thread object is more of a thread creator and controller than a physical thread. Otherwise, I would think it should disappear after being join()ed. It seems to me that these objects represent a more palatable abstraction of the physical thread... but I might (also :) be completely wrong.

Given that we accept (enjoy, even?) some level of abstraction on top of physical threads (for instance we start them after they have been initialized, and we check whether they are running, not whether they exist), it seems reasonable to me that stopping and restarting these conceptual threads should be possible. What do you think?

Thanks again for your consideration,

-Gabriel
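The lifetime mismatch described above can be observed directly. The following is a small sketch against the stock `threading` module in Python 3 (not the patched module from this thread); `probe` is a hypothetical helper name.

```python
import threading


def probe():
    t = threading.Thread(target=lambda: None)

    # The Thread object exists, but no underlying thread has been created yet.
    before = (t.ident, t.is_alive())

    t.start()
    t.join()

    # The underlying thread is gone, yet the object and its ident remain.
    after = (t.ident is not None, t.is_alive())

    # The stock implementation refuses a second start() on the same object.
    try:
        t.start()
        raised = False
    except RuntimeError:
        raised = True

    return before, after, raised
```

Calling `probe()` should show an unset identifier before `start()`, a finished but still inspectable object after `join()`, and a `RuntimeError` on the second `start()`.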
My 2 cents from my 30 seconds of reading this email thread: encapsulation shouldn't be done on the thread level, it should be done on the object level. Create an object that offers the behavior you want to have (call it ThreadStarter or something), and give it a 'start_thread()' method that returns a thread handle from which you can .join() as necessary. This ThreadStarter object keeps references to the necessary structures that you need to pass to the lower level threads. Or heck, this ThreadStarter could handle the .join() dispatch, etc. If you think about it for 5 minutes, I'm sure you could implement it.

Also, while it isn't impossible to "restart threads" the way you conceive of it, your way of conceiving of the "restart" is fundamentally wrong. Can you restart a process whose stack you've thrown away? Of course not. You've thrown away the process/thread's stack (which can be seen by the fact that you can .join() the thread), so you aren't "restarting" the thread, you are creating a new thread with a new stack with some of the same arguments to called functions.

- Josiah

(this message does not mean that I'm going to be spending much time in this list anymore, just that I saw this silly idea and had to comment)

On Sun, Mar 2, 2008 at 3:16 PM, Gabriel Grant <grantgm@mcmaster.ca> wrote:
> [...]
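One way the ThreadStarter idea suggested in this message might look is sketched below, in Python 3. Only the names `ThreadStarter` and `start_thread()` come from the message; the constructor arguments, the `ret` attribute, and the `count_up` and `demo` helpers are assumptions made for illustration.

```python
import threading


class ThreadStarter:
    """Owns the long-lived state; every start_thread() creates a fresh Thread."""

    def __init__(self, func, state):
        self.func = func      # callable that receives the persistent state
        self.state = state    # survives across all of the threads
        self.ret = None       # result of the most recent call

    def _run(self):
        self.ret = self.func(self.state)

    def start_thread(self):
        t = threading.Thread(target=self._run)
        t.start()
        return t              # handle the caller can .join()


def count_up(state):
    """Hypothetical stand-in for the blocking call; mutates state in place."""
    state["count"] += 1
    return state["count"]


def demo():
    starter = ThreadStarter(count_up, {"count": 0})
    seen = []
    for _ in range(3):
        starter.start_thread().join()
        seen.append(starter.ret)
    return seen
```

The state lives in the starter object rather than in the parent's namespace or in any single Thread, and each call runs on a new thread with a new stack.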
On Sun, Mar 02, 2008, Gabriel Grant wrote:
> Why is it that threads can't be restarted?

That's an interesting question. Unfortunately, the best person to answer it isn't on this list (Tim Peters). Generally speaking, the standard answer is to have a worker thread that uses a Queue in a loop.

> So this _seems_ to work, but I have to admit, I'm somewhat afraid to use it. I can't help but wonder: is it safe, or is it tempting the Gods of parallelism to inflict sudden, multi-threaded death?

If I had to guess, I think it's just adding an unnecessary layer of complexity to the existing Thread class. Moreover, the existing implementation prevents the following code:

    t.start()
    t.run()
    t.run()

which IMO definitely should be a bug. If you want to try creating a patch that adds a t.restart() method, I think it certainly wouldn't hurt anything and would be a good way of getting feedback.

--
Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/

"All problems in computer science can be solved by another level of indirection." --Butler Lampson
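The "worker thread that uses a Queue in a loop" pattern mentioned in this message might be sketched as follows, in Python 3 (where the module is named `queue`). The `worker` and `demo` functions, the `None` shutdown sentinel, and the running total used as state are all assumptions made for illustration.

```python
import queue
import threading


def worker(requests, replies, state):
    """Single long-lived thread: handle one request per loop iteration."""
    while True:
        item = requests.get()
        if item is None:          # sentinel value: shut the worker down
            break
        state["total"] += item    # state persists between requests
        replies.put(state["total"])


def demo():
    requests, replies = queue.Queue(), queue.Queue()
    state = {"total": 0}
    t = threading.Thread(target=worker, args=(requests, replies, state))
    t.start()

    out = []
    for n in (1, 2, 3):
        requests.put(n)             # ask the worker to run once more
        out.append(replies.get())   # block until it reports back

    requests.put(None)
    t.join()
    return out
```

Each `requests.put()` plays the role of a restart: the thread never exits between calls, so its state carries over without a second `start()`.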
Participants (4): Aahz, Gabriel Grant, Josiah Carlson, Leonardo Santagada