[Python-Dev] Updated AutoThreadState pre-PEP

Mark Hammond mhammond@skippinet.com.au
Fri, 14 Feb 2003 09:34:37 +1100


Hi all,
  I have updated my pre-PEP, and am about to start the process of making it
a real PEP :)

I have attached the latest and greatest version here.  I believe it is largely
complete.  There is a fully functional, platform independent implementation
at http://www.python.org/sf/684256

All comments, reviews, tests and anything else welcome.

Mark.

[Attachment: pep_gil.txt]

PEP: xxx
Title: Simplified Global Interpreter Lock acquisition for extensions
Version: $Revision: $
Last-Modified: $Date: $
Author: Mark Hammond <mhammond@skippinet.com.au>
Status:
Type:
Content-Type: text/plain
Created: Feb-2003
Post-History:

Open Issues
    This is where I note comments from people that are yet to be
    resolved.
    - JustvR prefers a PyGIL prefix over PyAutoThreadState.  MarkH
      prefers the latter, as the implementation is really about thread
      states as much as the GIL, but doesn't feel strongly either way.
    - Should we provide Py_AUTO_THREAD_STATE macros?
    - Is my "Limitation" regarding PyEval_InitThreads() OK?

Abstract

    This PEP proposes a simplified API for access to the Global
    Interpreter Lock (GIL) for Python extension modules.
    Specifically, it provides a solution for authors of complex
    multi-threaded extensions, where the current state of Python
    (i.e., the state of the GIL, or if Python is currently using the
    GIL, or indeed if Python has been initialized) is unknown.

    This PEP proposes a new API, for platforms built with threading
    support, to manage the Python thread state.  An implementation
    strategy is proposed, along with an initial, platform independent
    implementation.

Rationale

    The current Python interpreter state API is suitable for simple,
    single-threaded extensions, but quickly becomes incredibly complex
    for non-trivial, multi-threaded extensions.

    Currently Python provides two mechanisms for dealing with the GIL:

    - Py_BEGIN_ALLOW_THREADS and Py_END_ALLOW_THREADS macros.
      These macros are provided primarily to allow a simple Python
      extension that already owns the GIL to temporarily release it
      while making an "external" (i.e., non-Python), generally
      expensive, call.  Any existing Python threads that are blocked
      waiting for the GIL are then free to run.  While this is fine
      for extensions making calls from Python into the outside world,
      it is no help for extensions that need to make calls into Python
      when the thread state is unknown.

    - PyThreadState and PyInterpreterState APIs.
      These API functions allow an extension/embedded application to
      acquire the GIL, but suffer from a serious boot-strapping
      problem - they require you to know the state of the Python
      interpreter and of the GIL before they can be used.  One
      particular problem is for extension authors that need to deal
      with threads never before seen by Python, but need to call
      Python from this thread.  It is very difficult, delicate and
      error prone to author an extension where these "new" threads
      always know the exact state of the GIL, and therefore can
      reliably interact with this API.

    For these reasons, the question of how such extensions should
    interact with Python is quickly becoming a FAQ.  The main impetus
    for this PEP, a thread on python-dev [1], immediately identified
    the following projects with this exact issue:

    - The win32all extensions
    - Boost
    - ctypes
    - Python-GTK bindings
    - Uno
    - PyObjC
    - Mac toolbox
    - PyXPCOM

    Currently, there is no reasonable, portable solution to this
    problem, forcing each extension author to implement their own
    hand-rolled version.  Further, the problem is complex, meaning
    many implementations are likely to be incorrect, leading to a
    variety of problems that will often manifest simply as "Python has
    hung".

    While the biggest problem in the existing thread-state API is the
    lack of the ability to query the current state of the lock, it is
    felt that a more complete, simplified solution should be offered
    to extension authors.  Such a solution should encourage authors to
    provide error-free, complex extension modules that take full
    advantage of Python's threading mechanisms.

Limitations and Exclusions

    This proposal identifies a solution for extension authors with
    complex multi-threaded requirements, but that only require a
    single "PyInterpreterState".  There is no attempt to cater for
    extensions that require multiple interpreter states.  At the time
    of writing, no extension has been identified that requires
    multiple PyInterpreterStates, and indeed it is not clear if that
    facility works correctly in Python itself.

    While this auto-thread-state API will initialize Python itself, it
    will not automatically call PyEval_InitThreads().  Thus,
    multi-threaded applications must ensure this call is made
    manually.  As the thread which makes this call is nominated the
    "main thread" by Python, it is felt that the extension author
    should make this call explicitly, to ensure the main thread is
    determinate.

    It is intended that this API be all that is necessary to acquire
    the Python GIL.  Apart from the existing, standard
    Py_BEGIN_ALLOW_THREADS and Py_END_ALLOW_THREADS macros, it is
    assumed that no additional thread state API functions will be used
    by the extension.  Extensions with such complicated requirements
    are free to continue to use the existing thread state API.

Proposal

    This proposal recommends a new API be added to Python to simplify
    the management of the GIL.  This API will be available on all
    platforms built with WITH_THREAD defined.

    The intent is that an extension author be able to use a small,
    well-defined "prologue dance", at any time and on any thread, and
    this dance will ensure Python is ready to be used on that thread.
    After the extension has finished with Python, it must also perform
    an "epilogue dance" to release any resources previously acquired.
    Ideally, these dances will be able to be expressed in a single
    line.

    Specifically, the following new APIs are proposed:

    /* Ensure that the current thread is ready to call the Python C
    API, regardless of the current state of Python, or of its thread
    lock.  This may be called as many times as desired by a thread, so
    long as each call is matched with a call to
    PyAutoThreadState_Release().

    The return value is an opaque "handle" to the thread state when
    PyAutoThreadState_Ensure() was called, and must be passed to
    PyAutoThreadState_Release() to ensure Python is left in the same
    state.

    When the function returns, the current thread will hold the GIL.
    Thus, the GIL is held by the thread until
    PyAutoThreadState_Release() is called.  (Note that, as happens now
    in Python, calling a Python API function may indeed cause a
    thread-switch and therefore a GIL ownership change.  However,
    Python guarantees that when the API function returns, the GIL will
    again be owned by the thread making the call.)

    Failure is a fatal error.
    */
    PyAutoThreadState_State PyAutoThreadState_Ensure(void);

    /* Release any resources previously acquired.  After this call,
    Python's state will be the same as it was prior to the
    corresponding PyAutoThreadState_Ensure() call (but generally this
    state will be unknown to the caller, hence the use of the
    AutoThreadState API).

    Every call to PyAutoThreadState_Ensure() must be matched by a
    call to PyAutoThreadState_Release() on the same thread.
    */
    void PyAutoThreadState_Release(PyAutoThreadState_State state);

    Common usage will be:

    void SomeCFunction(void)
    {
        /* ensure we hold the lock */
        PyAutoThreadState_State state = PyAutoThreadState_Ensure();
        /* Use the Python API */
        ...
        /* Restore the state of Python */
        PyAutoThreadState_Release(state);
    }
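
    One practical consequence of the pairing rule is that every exit
    path of a function must reach the release call.  The following is
    a minimal sketch of a single-exit callback.  The two
    PyAutoThreadState functions here are counting stand-ins written
    for this example (so that it compiles without Python), and
    on_event() and its payload argument are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Placeholder stand-ins for the proposed API so the sketch compiles
   without Python.  They only count outstanding Ensure calls. */
typedef int PyAutoThreadState_State;
static int outstanding = 0;

static PyAutoThreadState_State PyAutoThreadState_Ensure(void)
{
    return outstanding++;          /* handle = nesting depth on entry */
}

static void PyAutoThreadState_Release(PyAutoThreadState_State state)
{
    outstanding--;
    assert(outstanding == state);  /* releases unwind in reverse order */
}

/* Hypothetical callback invoked by a foreign library on an arbitrary
   thread.  Returns 0 on success, -1 on failure. */
int on_event(const char *payload)
{
    int result = -1;
    PyAutoThreadState_State state = PyAutoThreadState_Ensure();

    if (payload == NULL)
        goto done;                 /* error path still reaches Release */

    /* ... use the Python C API here ... */
    result = 0;

done:
    PyAutoThreadState_Release(state);
    return result;
}
```

    Funnelling all exits through one label keeps each Ensure matched
    by exactly one Release, whichever path the function takes.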

Design and Implementation

    The general operation of PyAutoThreadState_Ensure() will be:
    - Ensure Python is initialized.
    - Get a PyThreadState for the current thread, creating and saving
      one if necessary.
    - Remember the current state of the lock (owned/not owned).
    - If the current thread does not own the GIL, acquire it.
    - Increment a counter for how many calls to PyAutoThreadState_Ensure
      have been made on the current thread.
    - Return the remembered state.

    The general operation of PyAutoThreadState_Release() will be:
    - Assert that our thread currently holds the lock.
    - If the old state indicates the lock was previously not held,
      release the GIL.
    - Decrement the PyAutoThreadState_Ensure counter for the thread.
    - If counter == 0:
      - Release the PyThreadState.
      - Forget the ThreadState as being owned by the thread.
    - Return.
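
    The two lists above can be modelled in a few lines of C.  This is
    only a toy sketch under stated assumptions, not the patch
    referenced by this PEP: a pthread mutex stands in for the GIL, C11
    _Thread_local variables stand in for the per-thread storage,
    interpreter initialization is left out, and every toy_ name is
    invented for illustration.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdlib.h>

/* Stand-in for the GIL: one process-wide mutex (an assumption of
   this model, not how Python implements its lock). */
static pthread_mutex_t toy_gil = PTHREAD_MUTEX_INITIALIZER;

/* Stand-in for PyThreadState; the real structure holds far more. */
typedef struct { int unused; } ToyThreadState;

/* Per-thread bookkeeping, kept in C11 thread-local storage. */
static _Thread_local ToyThreadState *toy_tstate = NULL;
static _Thread_local bool toy_holds_gil = false;
static _Thread_local int toy_ensure_count = 0;

/* Handle returned by toy_ensure(): was the lock held on entry? */
typedef enum { TOY_UNLOCKED = 0, TOY_LOCKED = 1 } ToyHandle;

ToyHandle toy_ensure(void)
{
    /* Get a thread state for this thread, creating one if needed. */
    if (toy_tstate == NULL) {
        toy_tstate = malloc(sizeof *toy_tstate);
        if (toy_tstate == NULL)
            abort();                    /* failure is a fatal error */
    }
    /* Remember the current state of the lock. */
    ToyHandle previous = toy_holds_gil ? TOY_LOCKED : TOY_UNLOCKED;
    /* Acquire the lock only if this thread does not already own it. */
    if (!toy_holds_gil) {
        pthread_mutex_lock(&toy_gil);
        toy_holds_gil = true;
    }
    toy_ensure_count++;
    return previous;
}

void toy_release(ToyHandle previous)
{
    assert(toy_holds_gil);
    /* Give the lock back only if it was not held before the
       matching toy_ensure() call. */
    if (previous == TOY_UNLOCKED) {
        pthread_mutex_unlock(&toy_gil);
        toy_holds_gil = false;
    }
    toy_ensure_count--;
    /* The last release on this thread drops the thread state. */
    if (toy_ensure_count == 0) {
        free(toy_tstate);
        toy_tstate = NULL;
    }
}
```

    Because the handle records whether the lock was already held, a
    nested toy_ensure()/toy_release() pair leaves the lock alone, and
    only the outermost pair acquires and releases it.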

    It is assumed that it is an error if two discrete PyThreadStates
    are used for a single thread.  Comments in pystate.h ("State
    unique per thread") support this view, although it is never
    directly stated.  Thus, this will require some implementation of
    Thread Local Storage.  Fortunately, a platform independent
    implementation of Thread Local Storage already exists in the
    Python source tree, in the SGI threading port.  This code will be
    integrated into the platform independent Python core, but in such
    a way that platforms can provide a more optimal implementation if
    desired.
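
    For illustration, here is one way portable thread-local storage
    can be built from nothing but a mutex and a thread identity.  It
    is a sketch written for this document, not the code from the
    Python source tree; the tls_ names are hypothetical, and POSIX
    threads are assumed for the mutex and for pthread_self().

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* One (key, thread) -> value association in a global list. */
struct tls_entry {
    struct tls_entry *next;
    int key;
    pthread_t thread;
    void *value;
};

static struct tls_entry *tls_head = NULL;
static pthread_mutex_t tls_mutex = PTHREAD_MUTEX_INITIALIZER;
static int tls_next_key = 0;

/* Allocate a fresh key; the same key is used by every thread. */
int tls_create_key(void)
{
    pthread_mutex_lock(&tls_mutex);
    int key = ++tls_next_key;
    pthread_mutex_unlock(&tls_mutex);
    return key;
}

/* Find the entry for (key, thread).  Caller must hold tls_mutex. */
static struct tls_entry *tls_find(int key, pthread_t self)
{
    for (struct tls_entry *e = tls_head; e != NULL; e = e->next)
        if (e->key == key && pthread_equal(e->thread, self))
            return e;
    return NULL;
}

/* Store value for the calling thread.  Returns 0 on success, or -1
   if memory for a new entry could not be allocated. */
int tls_set(int key, void *value)
{
    assert(key > 0);                    /* keys come from tls_create_key() */
    pthread_t self = pthread_self();
    int status = 0;
    pthread_mutex_lock(&tls_mutex);
    struct tls_entry *e = tls_find(key, self);
    if (e == NULL) {
        e = malloc(sizeof *e);
        if (e != NULL) {
            e->key = key;
            e->thread = self;
            e->next = tls_head;
            tls_head = e;
        } else {
            status = -1;
        }
    }
    if (e != NULL)
        e->value = value;
    pthread_mutex_unlock(&tls_mutex);
    return status;
}

/* Return the calling thread's value for key, or NULL if none is set. */
void *tls_get(int key)
{
    pthread_mutex_lock(&tls_mutex);
    struct tls_entry *e = tls_find(key, pthread_self());
    void *value = (e != NULL) ? e->value : NULL;
    pthread_mutex_unlock(&tls_mutex);
    return value;
}

/* Forget the calling thread's value for key, if any. */
void tls_delete(int key)
{
    pthread_t self = pthread_self();
    pthread_mutex_lock(&tls_mutex);
    for (struct tls_entry **p = &tls_head; *p != NULL; p = &(*p)->next) {
        if ((*p)->key == key && pthread_equal((*p)->thread, self)) {
            struct tls_entry *dead = *p;
            *p = dead->next;
            free(dead);
            break;
        }
    }
    pthread_mutex_unlock(&tls_mutex);
}
```

    A linear list behind a single mutex is slow but needs nothing
    beyond what every threading port provides, which is why a
    platform with native thread-local storage may prefer to supply
    its own, faster version.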

Implementation
    An implementation of this proposal can be found at
    http://www.python.org/sf/684256

References

    [1] http://mail.python.org/pipermail/python-dev/2002-December/031424.html

Copyright

    This document has been placed in the public domain.


Local Variables:
mode: indented-text
indent-tabs-mode: nil
sentence-end-double-space: t
fill-column: 70
End:
