threading.Thread.join(timeout=...) considered unreliable

Dennis Lee Bieber wlfraed at ix.netcom.com
Tue Oct 15 01:15:06 EDT 2002


Will Stuyvesant fed this fish to the penguins on Monday 14 October 2002 
11:22 am:

        Okay, my earlier attempt also shows this behavior.

        After much experimenting I think I've isolated it to...

        Your number cruncher likely calls C-language modules for the 
crunching. The C-language modules don't take part in Python's threading 
scheduler, so the earliest a thread swap can take place is when the 
function returns.

        Here is my last test -- note the behavior of the last test function, 
which contains a loop with small multiplication, rather than the ** 
function. Following it is the screen grab of a run.

        The thread scheduler in Python
,----[ /home/wulfraed/tt2.py ]
| from time import sleep, time
| import threading
| import Queue
| 
| start = 0.0
| 
| def Worker(*args, **kwargs):
|     me = threading.currentThread().getName()
|     retq = kwargs["retq"]
|     func = kwargs["ToDo"]
|     del kwargs["retq"]
|     del kwargs["ToDo"]
|     print me, " Running func", (time()-start) * 1000.0
|     rtn = func(*args, **kwargs)
|     print me, " returned", rtn, (time()-start) * 1000.0
|     retq.put((me, rtn))
| 
| def rcall(ms, func, *ar, **kw):
|     q = Queue.Queue()
|     kw["retq"] = q
|     kw["ToDo"] = func
|     w = threading.Thread(target=Worker, args=ar, kwargs=kw)
|     w.start()
|     print w.getName(), " rcall waiting", (time()-start) * 1000.0
|     w.join(float(ms) / 1000.0)
|     print w.getName(), " join expired", (time()-start) * 1000.0
|     if q.empty():
|         print "Queue is empty"
|         return "TimeOut"    #   leaves a dangling Queue and Worker
|         Thread
|     else:
|         print "Queue has data"
|         return q.get()
| 
| 
| 
| #   dummy func
| 
| def testFunctionA(a, b, c, d=2):
|         sleep(3.0)   # 3 seconds
|         return a+b*c+d
| 
| def testFunctionB(a):
|         # takes about 4 seconds with a == 19 on my computer,
|         # keeps the cpu very busy
|         return (2L ** 2L ** a) <> 12345L
| 
| def testFunctionC(a):
|         # takes about 4 seconds with a == 19 on my computer,
|         # keeps the cpu very busy
|         tmp = 2L
|         for t2 in range(a):
|             tmp = tmp * 2L
|         while tmp > 0:
|             t2 = 2L * t2
|             tmp = tmp - 1
|         return t2 <> 12345L
| 
| 
| if __name__ == "__main__":
|     start = time()
|     print rcall(50, testFunctionA, 1, 2, 3, 4)
|     print (time() - start) * 1000.0
|     print " "
|     start = time()
|     print rcall(500, testFunctionA, 1, 2, 3)
|     print (time() - start) * 1000.0
|     print " "
|     start = time()
|     print rcall(50, testFunctionB, 19)
|     print (time() - start) * 1000.0
|     print " "
|     start = time()
|     print rcall(500, testFunctionC, 19)
|     print (time() - start) * 1000.0
|     print "\n\n\n"
|     

`----

        Now for the output (lines with # are not part of the output):

[wulfraed at beastie wulfraed]$ python tt2.py
Thread-1  rcall waiting 27.9159545898
Thread-1  Running func 37.850022316
Thread-1  join expired 87.7649784088
Queue is empty
TimeOut
88.3300304413

# not sure why it took 27msec to get to the wait, but 87-27 => 60msec
# (for a 50msec timeout)

Thread-2  rcall waiting 19.4200277328
Thread-2  Running func 20.1630592346
Thread-2  join expired 529.172062874
Queue is empty
TimeOut
529.335975647

# change in arguments, 529 - 19 => 510msec (500msec timeout)

Thread-3  rcall waiting 1.05202198029
Thread-3  Running func 1.62100791931
Thread-3  join expired 12744.1780567
Queue is empty
TimeOut
12744.3389893

# your exponential equation, apparently the function took 12743msec
# before it returned from the C library, at which point the join
# timed out and the return data queue was still empty (because
# the worker task hasn't regained control to put the data into it)

Thread-4  rcall waiting 1.07705593109
Thread-1  returned 11 1.42109394073
Thread-2  returned 9 1.82807445526

# note how the first two rcall functions have finally completed
# ignore the timing, since the "start time" was reset for the start
# of rcall 4

Thread-4  Running func 2.53200531006
Thread-3  returned 1 46.63002491

# here's where the third rcall function finished

Thread-4  join expired 505.463004112
Queue is empty
TimeOut
505.625009537

# with the python loop, a thread swap took place and the join timed out
# as expected ~500msecs.
# BUT! thread 4 is still looping, and until it completes the main
# program itself can not exit!


IMPORTANT NOTICE -- Even though ALL cases resulted in a TIMEOUT return, 
the program itself cannot exit until thread-4 finishes its calculation!

        At this point in time, it's using over 50% of my CPU, and has been 
running for 17 minutes! Main program just doing nothing but waiting for 
the unkillable thread to finish.


        From "Python Essential Reference":

"... Furthermore, thread switching can only occur between the execution 
of individual bytecodes in the interpreter. ... By default, the 
interpreter checks for thread switching after every 10 bytecode 
instructions."

"..., the interpreter may invoke functions written in C. Unless 
specifically written to interact with a threaded Python interpreter, 
these functions block the execution of all other threads until they 
complete execution. Thus, a long-running calculation in an extension 
module may limit the effectiveness of using threads. ..."



-- 
--  
 > ============================================================== <
 >   wlfraed at ix.netcom.com  | Wulfraed  Dennis Lee Bieber  KD6MOG <
 >      wulfraed at dm.net     |       Bestiaria Support Staff       <
 > ============================================================== <
 >        Bestiaria Home Page: http://www.beastie.dm.net/         <
 >            Home Page: http://www.dm.net/~wulfraed/             <



More information about the Python-list mailing list