[Python-Dev] baby steps for free-threading

Greg Stein gstein@lyra.org
Mon, 17 Apr 2000 01:14:51 -0700 (PDT)

A couple of months ago, I exchanged a few emails with Guido about doing the
free-threading work, in particular for the 1.6 release. At that point
(and now), I said that I wouldn't be starting on it until this summer,
which means it would miss the 1.6 release. However, there are some items
that could go into 1.6 *today* that would make it easier down the road to
add free-threading to Python. I said that I'd post those in the hope that
somebody might want to look at developing the necessary patches. It fell
off my plate, so I'm getting back to that now...

Python needs a number of basic things to support free threading. None of
these should impact its performance or reliability. For the most part,
they just provide a platform for the later addition.

1) Create a portable abstraction for using the platform's per-thread state
   mechanism. On Win32, this is TLS. On pthreads, this is pthread_key_*.

   This mechanism will be used to store PyThreadState structure pointers,
   rather than _PyThreadState_Current. The latter variable must go away.

   Rationale: two threads will be operating simultaneously. An inherent
   conflict arises if _PyThreadState_Current is used. The TLS-like
   mechanism is used by the threads to look up "their" state.

   There will be a ripple effect on PyThreadState_Swap(); dunno offhand
   what. It may become empty.

2) Python needs a lightweight, short-duration, internally-used critical
   section type. The current lock type is used at the Python level and
   internally. For internal operations, it is rather heavyweight, has
   unnecessary semantics, and is slower than a plain crit section.

   Specifically, I'm looking at Win32's CRITICAL_SECTION and pthread's
   mutex type. A spinlock mechanism would be coolness.

   Rationale: Python needs critical sections to protect data from being
   trashed by multiple, simultaneous access. These crit sections need to
   be as fast as possible since they'll execute at all key points where
   data is manipulated.

3) Python needs an atomic increment/decrement (internal) operation.

   Rationale: these are used in INCREF/DECREF to correctly increment or
   decrement the refcount in the face of multiple threads trying to do
   so at the same time.

   Win32: InterlockedIncrement/Decrement. pthreads would use the
   lightweight crit section above (on every INC/DEC!!). Some other
   platforms may have specific capabilities to keep this fast. Note that
   platforms (outside of their threading libraries) may have functions to
   do this.

4) Python's configuration system needs to be updated to include a
   --with-free-thread option since this will not be enabled by default.
   Related changes to acconfig.h would be needed. Compiling in the above
   pieces based on the flag would be nice (although Python could switch to
   the crit section in some cases where it uses the heavy lock today).

   Rationale: duh

5) An analysis of Python's globals needs to be performed. Any global that
   can safely be made "const" should be. If a global is write-once (such as
   classobject.c::getattrstr), then it is marginally okay (there is a
   race condition, with an acceptable outcome, but a mem leak occurs).
   Personally, I would prefer a general mechanism in Python for creating
   "constants" which can be tracked by the runtime and freed.

   I would also like to see a generalized "object pool" mechanism be built
   and used for tuples, ints, floats, frames, etc.

   Rationale: any globals which are mutable must be made thread-safe. The
   fewer non-const globals to examine, the fewer to analyze for race
   conditions and thread-safety requirements.

   Note: making some globals "const" has a ripple effect through Python.
   This is sometimes known as "const poisoning". Guido has said that he
   would accept adding "const" throughout the interpreter, but would
   prefer a complete (rather than ripple-based, partial) overhaul.
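
To make items 1 through 3 a bit more concrete, here is a rough sketch of
what they could look like on a pthreads platform. Every name in it
(my_threadstate, my_critsect, my_atomic_incr, run_demo, and so on) is made
up for illustration; none of it is existing Python API. A real patch would
presumably hide this behind platform #ifdefs, with a Win32 variant built on
TLS, CRITICAL_SECTION and InterlockedIncrement/Decrement. It only shows the
shape: a per-thread slot for the thread state, a thin mutex wrapper as the
crit section, and increment/decrement serialized through that crit section.

```c
#include <assert.h>
#include <pthread.h>

/* All names here are hypothetical; this is not existing Python API. */

/* Item 1: per-thread state lookup via pthread_key_*. */
typedef struct { int thread_id; } my_threadstate;  /* stand-in for PyThreadState */

static pthread_key_t tstate_key;
static pthread_once_t tstate_once = PTHREAD_ONCE_INIT;

static void tstate_make_key(void)
{
    pthread_key_create(&tstate_key, NULL);
}

static void my_threadstate_set(my_threadstate *ts)
{
    pthread_once(&tstate_once, tstate_make_key);
    pthread_setspecific(tstate_key, ts);
}

static my_threadstate *my_threadstate_get(void)
{
    pthread_once(&tstate_once, tstate_make_key);
    return (my_threadstate *)pthread_getspecific(tstate_key);
}

/* Item 2: a lightweight critical section, here a plain pthread mutex. */
typedef pthread_mutex_t my_critsect;
#define MY_CRITSECT_INIT      PTHREAD_MUTEX_INITIALIZER
#define my_critsect_enter(cs) pthread_mutex_lock(cs)
#define my_critsect_leave(cs) pthread_mutex_unlock(cs)

/* Item 3: increment/decrement made atomic by going through the crit
   section. This is the slow fallback for platforms with no native op. */
static my_critsect refcnt_cs = MY_CRITSECT_INIT;

static long my_atomic_incr(long *p)
{
    long v;
    my_critsect_enter(&refcnt_cs);
    v = ++*p;
    my_critsect_leave(&refcnt_cs);
    return v;
}

static long my_atomic_decr(long *p)
{
    long v;
    my_critsect_enter(&refcnt_cs);
    v = --*p;
    my_critsect_leave(&refcnt_cs);
    return v;
}

/* Demo: several threads hammer one counter and look up "their" state. */
#define NTHREADS 4
#define NITER    100000

static long shared_refcnt;
static int tls_ok[NTHREADS];

static void *worker(void *arg)
{
    my_threadstate *ts = (my_threadstate *)arg;
    int i;

    my_threadstate_set(ts);
    for (i = 0; i < NITER; i++) {
        /* two increments and one decrement: net +1 per iteration */
        my_atomic_incr(&shared_refcnt);
        my_atomic_incr(&shared_refcnt);
        my_atomic_decr(&shared_refcnt);
    }
    /* each thread must get back its own pointer, not another thread's */
    tls_ok[ts->thread_id] = (my_threadstate_get() == ts);
    return NULL;
}

static long run_demo(void)
{
    pthread_t threads[NTHREADS];
    my_threadstate states[NTHREADS];
    int i, rc;

    shared_refcnt = 0;
    for (i = 0; i < NTHREADS; i++) {
        states[i].thread_id = i;
        rc = pthread_create(&threads[i], NULL, worker, &states[i]);
        assert(rc == 0);
    }
    for (i = 0; i < NTHREADS; i++)
        pthread_join(threads[i], NULL);
    return shared_refcnt;
}
```

Calling run_demo() from a main() (built with the platform's pthread flag)
should return NTHREADS * NITER, since every iteration nets +1. Without the
crit section around the ++ and --, updates could be lost and the total
would be unpredictable. A platform with a native atomic operation would
replace the bodies of my_atomic_incr/my_atomic_decr and skip the lock.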

I think that is all for now. Achieving these five steps within the 1.6
timeframe means that the free-threading patches will be *much* smaller. It
also creates much more visibility and testing for these sections.

Post 1.6, a patch set to add critical sections to lists and dicts would be
built. In addition, a new analysis would be done to examine the globals
that are available along with possible race conditions in other mutable
types and structures. Not all structures will be made thread-safe; for
example, frame objects are used by a single thread at a time (I'm sure
somebody could find a way to have multiple threads use or look at them,
but that person can take a leap, too :-)
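
For a feel of what "critical sections on lists" might mean, here is a
minimal sketch, again on pthreads. It is not the real list object: my_list
and its functions are hypothetical stand-ins, with a per-object mutex
playing the role of the crit section from item 2. The point is only that
the resize and the store happen under one lock, so two simultaneous appends
cannot trash the item array.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-in for a list object with its own crit section. */
typedef struct {
    pthread_mutex_t cs;      /* protects items, size and allocated */
    int *items;
    size_t size;
    size_t allocated;
} my_list;

static int my_list_init(my_list *l)
{
    l->items = NULL;
    l->size = 0;
    l->allocated = 0;
    return pthread_mutex_init(&l->cs, NULL);
}

static void my_list_destroy(my_list *l)
{
    free(l->items);
    pthread_mutex_destroy(&l->cs);
}

/* Append under the lock; returns 0 on success, -1 if out of memory. */
static int my_list_append(my_list *l, int value)
{
    int rc = 0;

    pthread_mutex_lock(&l->cs);
    if (l->size == l->allocated) {
        size_t n = l->allocated ? l->allocated * 2 : 8;
        int *p = realloc(l->items, n * sizeof(int));
        if (p == NULL)
            rc = -1;
        else {
            l->items = p;
            l->allocated = n;
        }
    }
    if (rc == 0)
        l->items[l->size++] = value;
    pthread_mutex_unlock(&l->cs);
    return rc;
}

static size_t my_list_size(my_list *l)
{
    size_t n;

    pthread_mutex_lock(&l->cs);
    n = l->size;
    pthread_mutex_unlock(&l->cs);
    return n;
}

/* Demo: several threads append to the same list at once. */
#define LIST_THREADS 4
#define LIST_APPENDS 1000

static void *append_worker(void *arg)
{
    my_list *l = (my_list *)arg;
    int i;

    for (i = 0; i < LIST_APPENDS; i++)
        my_list_append(l, i);
    return NULL;
}

static size_t run_list_demo(void)
{
    pthread_t threads[LIST_THREADS];
    my_list list;
    size_t n;
    int i, rc;

    rc = my_list_init(&list);
    assert(rc == 0);
    for (i = 0; i < LIST_THREADS; i++) {
        rc = pthread_create(&threads[i], NULL, append_worker, &list);
        assert(rc == 0);
    }
    for (i = 0; i < LIST_THREADS; i++)
        pthread_join(threads[i], NULL);
    n = my_list_size(&list);
    my_list_destroy(&list);
    return n;
}
```

Barring an allocation failure, run_list_demo() should report
LIST_THREADS * LIST_APPENDS items. A per-object lock is just one possible
granularity; whether the real patches would lock per object or more
coarsely is left open here.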

Depending upon Guido's desire, the various schedules, and how well the
development goes, Python 1.6.1 could incorporate the free-threading option
in the base distribution.


Greg Stein, http://www.lyra.org/