Threading failure

I asked this question here a few days ago, and someone suggested I ask again, providing all the code, so that it can be actually tried. So here it is, and I’m still stumped.
Original post: I have a problem with threading using the Python/C API. I have an extension that implements a timer, and the C++ timer callback function calls a Python function. The code looks like this:

// fetimer.cpp : Defines the exported functions for timer function.
#define _WIN32_WINNT 0x502

#include "stdafx.h"
#include "mmsystem.h"
#include "python.h"

#define TARGET_RESOLUTION 1

UINT wTimerRes, fooval;
BOOL LEDflag, timeFlag;
BOOL timerActive = FALSE;
BOOL timerSet = FALSE;
BOOL funcsetFlag = FALSE;
BOOL modsetFlag = FALSE;
BOOL attrsetFlag = FALSE;
UINT wTimerID, timeval, setVal = 2000;
HANDLE porthandle;
TIMECAPS tc;
UINT userval, *pVal;
PyObject *mod, *attr, *pargs, *pres;

UINT SetTimerCallback( UINT wTimrID, UINT msInterval );
static void CALLBACK PeriodicTimer(UINT wTimerID, UINT msg, DWORD dwUser, DWORD dw1, DWORD dw2);
PyObject *ProcessTimer( void );
PyObject *pymod;
PyObject *pattr;
PyObject *foop;
static void (*pFunc)(UINT wTimerID, UINT msg, DWORD dwUser, DWORD dw1, DWORD dw2);

static PyObject *timer_setup( PyObject *pSelf, PyObject *pArgs )
{
    if (timeGetDevCaps(&tc, sizeof(TIMECAPS)) != TIMERR_NOERROR) {
    }
    wTimerRes = min(max(tc.wPeriodMin, TARGET_RESOLUTION), tc.wPeriodMax);
    timeBeginPeriod(wTimerRes);
    LEDflag = FALSE;
    timeFlag = FALSE;
    timerActive = FALSE;
    userval = 123;
    pVal = &userval;
    Py_Initialize();
    fooval = 0;
    return Py_None;
}

static PyObject *timer_start( PyObject *pSelf, PyObject *pArgs )
{
    int tval = 2000;
    if( timerActive ){
        timeKillEvent( wTimerID );
    }
    if( timerSet ){
        tval = setVal;
    }
    UINT retval = SetTimerCallback( wTimerID, tval );
    timerActive = TRUE;
    return Py_None;
}

UINT SetTimerCallback(UINT wTimrID,    // sequencer data
                      UINT msInterval) // event interval
{
    wTimerID = timeSetEvent(
        msInterval,                    // delay
        wTimerRes,                     // resolution (global variable)
        (LPTIMECALLBACK)PeriodicTimer, // callback function
        wTimrID,                       // user data
        TIME_PERIODIC|TIME_CALLBACK_FUNCTION );
    if(!wTimerID)
    {
        return 999;
    }
    else{
        return 0;
    }
}

static void CALLBACK PeriodicTimer(UINT wTimerID, UINT msg, DWORD dwUser, DWORD dw1, DWORD dw2)
{
    PyGILState_STATE pgs;
    pgs = PyGILState_Ensure();
    if(attrsetFlag)
    {
        pres = PyObject_CallFunction(attr,NULL);
        if( pres == NULL )printf("CallFunction failed!\n");
    }
    PyGILState_Release( pgs );
}

static PyObject *timer_kill( PyObject *pSelf, PyObject *pArgs )
{
    timeKillEvent( wTimerID );
    timerSet = FALSE;
    timerActive = FALSE;
    return Py_None;
}

static PyObject *timer_settime( PyObject *pSelf, PyObject *pArgs )
{
    UINT t;
    PyArg_ParseTuple( pArgs, "i", &t );
    setVal = t;
    timerSet = TRUE;
    return Py_None;
}

static PyObject *timer_setmodname( PyObject *pSelf, PyObject *pArgs )
{
    char *b;
    PyArg_ParseTuple( pArgs, "s", &b );
    mod = PyImport_ImportModule(b);
    if( mod == NULL )
    {
        printf("Could not import %s\n",b);
        return Py_None;
    }
    modsetFlag = TRUE;
    return Py_None;
}

static PyObject *timer_setprocname( PyObject *pSelf, PyObject *pArgs )
{
    char *b;
    if( !modsetFlag )return Py_None;
    PyArg_ParseTuple( pArgs, "s", &b );
    attr = PyObject_GetAttrString(mod,b);
    if( attr == NULL )
    {
        printf("Could not import %s\n",b);
        return Py_None;
    }
    attrsetFlag = TRUE;
    return Py_None;
}

static PyMethodDef fetimer_methods[] = {
    {"timer", timer_setup, METH_VARARGS, "blah"},
    {"start", timer_start, METH_VARARGS, "start timer" },
    {"kill", timer_kill, METH_VARARGS, "kill timer" },
    {"settime", timer_settime, METH_VARARGS, "set timer" },
    {"setmodname", timer_setmodname, METH_VARARGS, "set module name" },
    {"setprocname", timer_setprocname, METH_VARARGS, "set procedure name" },
    {NULL, NULL}
};

PyMODINIT_FUNC initfetimer(void)
{
    Py_InitModule("fetimer", fetimer_methods);
}

The Python code (Timetest3.py) that sets this up looks like this:

#Time Test 3.py
import fetimer
#import feserial
import time

Hit = 0
L = 0
fetimer.timer()
fetimer.settime(30)
fetimer.setmodname("Timeslice3")
fetimer.setprocname("Timetester")
#feserial.open(4)
#feserial.dtr(0)
print "\n Program Waiting for Time Slice"
fetimer.start()
while True:
    time.sleep(0.010)

and the module Timeslice3.py looks like this:

#Timeslice3.py
def Timetester():
    pass

The application should run by entering "python timetest3.py" at the command prompt.
When I run this stuff, it works fine for hundreds, often even thousands, of timer ticks (I’ve been testing with about thirty ticks per second, but it doesn’t matter – it still crashes at ten or fewer ticks per second). Sometimes it runs for only a few seconds, sometimes for ten minutes or so. But it always eventually crashes Python. Usually it gives no error message. Sometimes, though, it does give an error message, but not always the same one. I’ve noted three that it has given in my testing so far:
Fatal Python Error: This thread state must be current when releasing
Fatal Python Error: PyThreadState_DeleteCurrent: no current tstate
Fatal Python Error: PyEval_SaveThread: NULL tstate
Can anybody help me make this code stable, so that it works all the time? I’m using Python 2.6.5 under Windows Vista, but it crashes under Windows XP as well.

On Mon, Jul 5, 2010 at 8:36 PM, Paul Grunau <paul@anilabsys.com> wrote:
Although I've never written a pure extension, I have just implemented an embedded & extended python, that works with multiple C threads, and I can't see anything glaringly obviously wrong in your code. What I might suggest is that you try with the debug version of python, and then when the program crashes you can take a dump, allowing you to see at least the stack trace for the different threads. Maybe that will provide some clues. I found that useful when debugging mine.
Also, since you suggest that it only crashes when the timer ticks very often, could it be a re-entrancy problem? The docs on timeSetEvent don't give much information: if one timer callback hasn't finished by the time the next one is due, does the timer start a new thread, or does it wait for the last callback to finish and then call back again on the same thread?
Hopefully someone with more experience can spot something.
Just a suggestion.
Dave.
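
One way to rule the re-entrancy theory in or out is to make the callback skip a tick whenever the previous tick is still running, and count how often that happens. The following is only a sketch in portable C++11, not part of the original fetimer code: the names g_inCallback, g_skippedTicks, FlagReleaser and RunTickExclusive are made up for illustration, and it assumes that dropping an overlapping tick is acceptable.

```cpp
#include <atomic>

// Hypothetical guard state (not in the original fetimer.cpp).
static std::atomic_flag g_inCallback = ATOMIC_FLAG_INIT;
static std::atomic<unsigned> g_skippedTicks{0};

// Clears the flag when the tick finishes, even if the work throws.
struct FlagReleaser {
    ~FlagReleaser() { g_inCallback.clear(std::memory_order_release); }
};

// Runs `work` only if no other tick is currently inside it.
// Returns true if the work ran, false if this tick was skipped.
template <typename Work>
bool RunTickExclusive(Work work) {
    if (g_inCallback.test_and_set(std::memory_order_acquire)) {
        // A previous tick is still running: record the overlap and bail out.
        g_skippedTicks.fetch_add(1, std::memory_order_relaxed);
        return false;
    }
    FlagReleaser release;
    work();
    return true;
}
```

In the extension, the body of PeriodicTimer would be passed as `work`. If g_skippedTicks stays at zero while the crash still occurs, overlapping callbacks are probably not the cause.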

On 6/07/2010 4:36 AM, Paul Grunau wrote:
It isn't clear that this is your problem, but all your methods which return None fail to increment its reference count.
Also, consider reducing your code down to the smallest possible sample which demonstrates the problem and attach the source files to the message. You may find that if you can get it down to "a few" lines of code which people can use without having to copy/paste and then fix the line wrapping, people will actually try it rather than speculate about it.
HTH,
Mark
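
To see why the missing increment matters, here is a toy model in plain C++. It is not the CPython API: Obj, incref, decref, g_none, return_none_buggy, return_none_correct and call_and_discard are all hypothetical names that only mimic the counting. It assumes, as in CPython, that the caller of a function owns the returned reference and drops it when the result is not kept.

```cpp
// Toy model only; nothing here is a real Python/C API call.
struct Obj { long refcnt; };

// The singleton starts with one reference held by the "interpreter".
static Obj g_none = {1};

inline void incref(Obj* o) { ++o->refcnt; }
inline void decref(Obj* o) { --o->refcnt; }  // a real interpreter frees at zero

// Buggy pattern: hands out the singleton without taking a new reference.
Obj* return_none_buggy() { return &g_none; }

// Correct pattern: take a reference on behalf of the caller, then return.
Obj* return_none_correct() { incref(&g_none); return &g_none; }

// What a caller does with a result it does not keep: drop one reference.
void call_and_discard(Obj* (*fn)()) { decref(fn()); }
```

With the correct pattern the count is unchanged after each call. With the buggy pattern every call costs the singleton one reference, so a function that is called often enough eventually drives the count to zero. In the real API the fix is Py_INCREF(Py_None) before `return Py_None;`, or the Py_RETURN_NONE macro.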

Mark Hammond, 19.07.2010 03:15:
Good call. Since returning None is so common, there's even a macro for that, called Py_RETURN_NONE.
A suggestion to the OP at this point: using Cython instead of C makes it easy to get the ref-counting right, as you don't have to care about it there.
Another very common thing to get wrong: if the callback comes within a completely new thread that was not created by Python, you need to set up Python's thread state first. PyGILState_Ensure() is not enough in this case. See the C-API docs.
Stefan

participants (4): Dave Brotherstone, Mark Hammond, Paul Grunau, Stefan Behnel