[concurrency] Inside the Python GIL

Adam Sampson ats at offog.org
Sat Jun 13 01:44:42 CEST 2009


Jeffrey Yasskin <jyasskin at gmail.com> writes:

> I think we could solve this by checking the elapsed time on each
> "check" rather than unconditionally switching threads,

This is a reasonable strategy, and one that's used elsewhere. One
problem is that reading the clock can be quite expensive, depending on
the OS and hardware -- although you may be able to get away with a
cheap-but-inaccurate clock (e.g. the IA32 TSC). This is a fairly common
problem for lightweight threading frameworks, so it's probably worth
looking at how (for example) the GHC runtime solves it.

I think your suggestion that each thread should have its own signal
is entirely sensible. It'd be less efficient than a proper lightweight
threading implementation (which is hard to do portably), but it'd at
least avoid the expensive contention on the lock in the current
approach, allow smarter scheduling for handling I/O and signals, and
make it easier to reason about Python's scheduling behaviour.

Managing the queues of processes to be run can be done in a wait-free
way, if atomics are available, so locking can go away entirely. It's
worth noting that a thread can decide to reschedule for two reasons:
either because it's hit one of the periodic checks, in which case it can
be woken up again immediately and can thus just wait, or because it
wants to go away and do something that doesn't involve the Python
runtime state, in which case it'll need to explicitly requeue itself
before waiting. This also gives you a fairly crude way to distinguish
CPU-bound (the first case) and I/O-bound (the second case) threads; in
the first case, you might want to prefer immediately rescheduling the
same thread most of the time if there are no higher-priority threads
waiting, to reduce cache thrashing.

Somewhat apropos of this, I did an extremely grotty hack a while ago to
build a Python threading implementation on top of CCSP, our multicore
lightweight threads runtime:

  Patch: http://offog.org/stuff/python-2.5-ccsp-v1.diff
  CCSP is part of KRoC: http://projects.cs.kent.ac.uk/projects/kroc/

(It's grotty because CCSP's lock semantics don't match Python's, and
because there's no way of getting a "thread identifier" from CCSP
directly. Both would be fixable with changes to CCSP.)

The benchmark results for pure Python programs were largely unimpressive
(as I expected), since the cheap communication was swamped by the amount
of time spent claiming and releasing the GIL...

-- 
Adam Sampson <ats at offog.org>                         <http://offog.org/>


More information about the concurrency-sig mailing list