[Python-Dev] Extension modules, Threading, and the GIL

David Abrahams dave@boost-consulting.com
Tue, 31 Dec 2002 07:24:31 -0500


martin@v.loewis.de (Martin v. Löwis) writes:

> David Abrahams <dave@boost-consulting.com> writes:
>
>> >> The symptom is that Python complains at some point that there's no
>> >> thread state.  It goes away if A releases the GIL before calling
>> >> into Qt, and reacquires the GIL afterwards.
> [...]
>> No, I am not saying A releases the GIL.
>
> "...there is no thread state. It [the thread state] goes away if A
> releases the GIL ..."
>
> From that I inferred that A releases the GIL, since you said that
> there is no thread state. Rereading your message, I now see that you
> meant "It [the problem] goes away".

Right.

> So I now understand that you reported that there is no deadlock, and
> that A does not release the GIL, and that Python reports that there is
> no thread state "when A returns to Python". You also report that B
> acquires the GIL.
>
> I can't understand why this happens. How does B acquire the GIL?
>
> Assuming that B uses PyEval_AcquireThread/PyEval_ReleaseThread, I
> would expect that
> a) there is a deadlock if this happens in a context of a call to
>    A, since the GIL is already held, and (if, for some reason,
>    locks are recursive on this platform),
> b) the code
>
> 	if (PyThreadState_Swap(tstate) != NULL)
> 		Py_FatalError(
> 			"PyEval_AcquireThread: non-NULL old thread state");
>
>    should trigger, as there is an old thread state.
>
> So I infer that B does not use
> PyEval_AcquireThread/PyEval_ReleaseThread. What else does it use?


Looking at the SIP sources, it appears to be using
PyEval_SaveThread/PyEval_RestoreThread, but I'd have to ask Phil to
weigh in on this one to know for sure.

Here's a stack backtrace reported by my user.  You can ignore the
oddness of frame #4; the SIP author is patching Python's instance
method table, but has convinced me that what he's doing is harmless
(it's still evil, of course <wink>).

  Program received signal SIGSEGV, Segmentation fault.
  [Switching to Thread 1024 (LWP 5948)]
  PyErr_SetObject (exception=0x8108a8c, value=0x81ab208) at Python/errors.c:39
  (gdb) bt
  #0  PyErr_SetObject (exception=0x8108a8c, value=0x81ab208)
      at Python/errors.c:39
  #1  0x08087ac7 in PyErr_Format (exception=0x8108a8c,
      format=0x80df620 "%.50s instance has no attribute '%.400s'")
      at Python/errors.c:408
  #2  0x080b0467 in instance_getattr1 (inst=0x82c5654, name=0x8154558)
      at Objects/classobject.c:678
  #3  0x080b3e35 in instance_getattr (inst=0x82c5654, name=0x8154558)
      at Objects/classobject.c:715
  #4  0x40cd2a43 in instanceGetAttr ()
     from /usr/local/lib/python2.2/site-packages/libsip.so
  #5  0x08056794 in PyObject_GetAttr (v=0x82c5654, name=0x8154558)
      at Objects/object.c:1108
  #6  0x0807705e in eval_frame (f=0x811a974) at Python/ceval.c:1784
  #7  0x0807866e in PyEval_EvalCodeEx (co=0x8161de0, globals=0x81139b4,
      locals=0x81139b4, args=0x0, argcount=0, kws=0x0, kwcount=0, defs=0x0,
      defcount=0, closure=0x0) at Python/ceval.c:2595
  #8  0x0807a700 in PyEval_EvalCode (co=0x8161de0, globals=0x81139b4,
      locals=0x81139b4) at Python/ceval.c:481
  #9  0x080950b1 in run_node (n=0x81263b8,
      filename=0xbffffca2 "/home/pfkeb/hippodraw-BUILD/testsuite/dclock.py",
      globals=0x81139b4, locals=0x81139b4, flags=0xbffffac4)
      at Python/pythonrun.c:1079
  #10 0x08095062 in run_err_node (n=0x81263b8,
      filename=0xbffffca2 "/home/pfkeb/hippodraw-BUILD/testsuite/dclock.py",
      globals=0x81139b4, locals=0x81139b4, flags=0xbffffac4)
      at Python/pythonrun.c:1066
  #11 0x08094ccb in PyRun_FileExFlags (fp=0x8104038,
      filename=0xbffffca2 "/home/pfkeb/hippodraw-BUILD/testsuite/dclock.py",
      start=257, globals=0x81139b4, locals=0x81139b4, closeit=1,
      flags=0xbffffac4) at Python/pythonrun.c:1057
  #12 0x080938b1 in PyRun_SimpleFileExFlags (fp=0x8104038,
      filename=0xbffffca2 "/home/pfkeb/hippodraw-BUILD/testsuite/dclock.py",
      closeit=1, flags=0xbffffac4) at Python/pythonrun.c:685
  #13 0x0809481f in PyRun_AnyFileExFlags (fp=0x8104038,
      filename=0xbffffca2 "/home/pfkeb/hippodraw-BUILD/testsuite/dclock.py",
      closeit=1, flags=0xbffffac4) at Python/pythonrun.c:495
  #14 0x08053632 in Py_Main (argc=2, argv=0xbffffb54) at Modules/main.c:364
  #15 0x08052ee6 in main (argc=2, argv=0xbffffb54) at Modules/python.c:10
  #16 0x40088627 in __libc_start_main (main=0x8052ed0 <main>, argc=2,
      ubp_av=0xbffffb54, init=0x80522d4 <_init>, fini=0x80cf610 <_fini>,
      rtld_fini=0x4000dcd4 <_dl_fini>, stack_end=0xbffffb4c)
      at ../sysdeps/generic/libc-start.c:129
  (gdb)

     On the line of the error

          oldtype = tstate->curexc_type;

  (gdb) p tstate
  $1 = (PyThreadState *) 0x0
  (gdb)
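
For readers who don't have the semantics of PyEval_SaveThread and
PyEval_RestoreThread in their heads: the first releases the GIL and
leaves the current thread state NULL, returning the old one; the second
reacquires the GIL and reinstalls the state it is given. The following
is a toy model only, not CPython's code. Every `toy_*` name is
hypothetical, and the "GIL" is a plain flag rather than a real lock. It
is meant to show that anything running between a save and its matching
restore sees a NULL current thread state, which is the condition gdb
reports above. Whether that is what actually happens in the SIP case is
not established here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for PyThreadState. */
typedef struct { int id; } toy_tstate;

/* Stand-ins for the interpreter's "current thread state" pointer and
   for the GIL (modelled as a flag; no real locking happens here). */
static toy_tstate *toy_current = NULL;
static int toy_gil_held = 0;

/* Modelled on PyThreadState_Swap: install new_ts, return the old one. */
static toy_tstate *toy_swap(toy_tstate *new_ts)
{
    toy_tstate *old = toy_current;
    toy_current = new_ts;
    return old;
}

/* Modelled on PyEval_SaveThread: clear the current thread state,
   release the lock, and hand the old state back to the caller. */
static toy_tstate *toy_save_thread(void)
{
    toy_tstate *old = toy_swap(NULL);
    assert(old != NULL);  /* the real function calls Py_FatalError here */
    toy_gil_held = 0;
    return old;
}

/* Modelled on PyEval_RestoreThread: take the lock again and reinstall
   the saved thread state. */
static void toy_restore_thread(toy_tstate *ts)
{
    assert(ts != NULL);   /* the real function calls Py_FatalError here */
    toy_gil_held = 1;
    toy_swap(ts);
}
```

In this model, code that runs after toy_save_thread() and before
toy_restore_thread() finds toy_current == NULL, so a dereference like
the `tstate->curexc_type` line above would fault.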

>> > If there was a thread state when it was called, there should be a
>> > thread state when it returns.
>>
>> Yes, the whole problem is that there's no way to know whether there's
>> a thread state.
>
> Wrong. If B acquires the GIL, B must use some thread state to do
> so. It must install that thread state through PyThreadState_Swap,
> directly or indirectly. That will return the old thread state, or
> NULL.

Let me rephrase: the whole problem is that there's no way to know if
you have the interpreter lock.  You can't call PyThreadState_Swap to
find out if there's a thread state if you don't have the interpreter
lock.  You can't acquire the lock if you already have it.

>> > If so, a mutex-protected record might work, but also might be
>> > expensive.
>>
>> Yes.  I assume that acquiring the GIL already needs to do
>> synchronization, though.
>
> Sure. But with that proposed change, you have not only the GIL lock
> call (which is a single sem_wait call on Posix, and an
> InterlockedCompareExchange call on Win32). You also get a mutex
> call, and a call to find out the current thread.

There you go, it's a harder problem than I thought ;-)
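
To make the "mutex-protected record" idea concrete, here is a minimal
sketch using POSIX threads in place of Python's lock. It is not a
proposed patch: `owned_lock` and all of its functions are hypothetical
names, and a pthread mutex stands in for the GIL. It pairs a
non-recursive lock with a small record of which thread holds it, so a
caller can ask "do I already have it?" before acquiring. The extra
costs are the ones Martin lists: one more mutex round trip and a call
to find the current thread on every check.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for the GIL plus an ownership record. */
typedef struct {
    pthread_mutex_t lock;    /* the big, non-recursive lock */
    pthread_mutex_t record;  /* protects has_owner and owner */
    bool has_owner;
    pthread_t owner;
} owned_lock;

static owned_lock demo_lock = {
    .lock = PTHREAD_MUTEX_INITIALIZER,
    .record = PTHREAD_MUTEX_INITIALIZER,
    .has_owner = false,
};

/* Does the calling thread currently hold the big lock? */
static bool owned_lock_held_by_me(owned_lock *l)
{
    pthread_mutex_lock(&l->record);
    bool mine = l->has_owner && pthread_equal(l->owner, pthread_self());
    pthread_mutex_unlock(&l->record);
    return mine;
}

/* Acquire the big lock unless this thread already holds it.
   Returns true if this call took the lock (the caller must release it),
   false if the thread held it already. */
static bool owned_lock_ensure(owned_lock *l)
{
    if (owned_lock_held_by_me(l))
        return false;
    pthread_mutex_lock(&l->lock);
    pthread_mutex_lock(&l->record);
    l->owner = pthread_self();
    l->has_owner = true;
    pthread_mutex_unlock(&l->record);
    return true;
}

/* Undo owned_lock_ensure; a no-op when that call did not take the lock. */
static void owned_lock_release(owned_lock *l, bool took_it)
{
    if (!took_it)
        return;
    assert(owned_lock_held_by_me(l));
    pthread_mutex_lock(&l->record);
    l->has_owner = false;
    pthread_mutex_unlock(&l->record);
    pthread_mutex_unlock(&l->lock);
}
```

An extension in B's position would call owned_lock_ensure() before
touching Python objects and pass the returned flag to
owned_lock_release() afterwards, so it behaves the same whether or not
its caller already held the lock.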

-- 
                       David Abrahams
   dave@boost-consulting.com * http://www.boost-consulting.com
Boost support, enhancements, training, and commercial distribution