[Python-Dev] "Fixing" the new GIL

Tue Mar 16 11:37:52 CET 2010

Martin v. Löwis wrote:
> Cameron Simpson wrote:
>> The idea here is that one has a few threads receiving requests (eg a
>> daemon watching a socket or monitoring a db queue table) which then use
>> the FuncMultiQueue to manage how many actual requests are processed
>> in parallel (yes, a semaphore can cover a lot of this, but not the
>> asynchronous call modes).
> 
> Why do you say processing is in parallel? In Python, processing is
> normally never in parallel, but always sequential (potentially
> interleaving). Are you releasing the GIL for processing?

We know the GIL is time slicing - that's the whole point. The real
question is *how often* it switches and *which threads* get to run.

The key issue is that the new GIL implementation, as it currently
stands, means that, in the presence of a CPU bound GIL-holding thread,
the *minimum* duration of *any* C call that releases the GIL becomes the
switching interval (sys.getswitchinterval(), the new GIL's replacement
for the old check interval). The CPU bound thread *always* wants the GIL
and once it gets it, it won't give it back until that interval expires.
This is in contrast to the I/O bound (or non-GIL holding CPU bound)
threads that nearly always yield the GIL early, since they're waiting
for something else to happen (be it I/O or a native calculation).
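
A rough way to see this effect is to time a short GIL-releasing call
with and without a CPU bound thread running alongside it. The sketch
below is only an illustration: the helper names (spin, mean_call_time)
and the use of time.sleep() as the stand-in C call are my assumptions,
and it reads the interval with sys.getswitchinterval(), the accessor the
new GIL uses for its switching interval. Numbers will vary by machine
and interpreter build.

```python
import sys
import threading
import time


def spin(stop):
    """CPU bound loop that never gives up the GIL voluntarily."""
    n = 0
    while not stop.is_set():
        n += 1


def mean_call_time(calls=100, call_s=0.0001, with_cpu_thread=True):
    """Mean wall-clock seconds per short GIL-releasing call."""
    stop = threading.Event()
    worker = threading.Thread(target=spin, args=(stop,), daemon=True)
    if with_cpu_thread:
        worker.start()
    try:
        start = time.perf_counter()
        for _ in range(calls):
            # time.sleep() releases the GIL and must win it back afterwards
            time.sleep(call_s)
        elapsed = time.perf_counter() - start
    finally:
        # always stop the spinner, even if timing was interrupted
        stop.set()
        if with_cpu_thread:
            worker.join()
    return elapsed / calls


if __name__ == "__main__":
    print("switch interval: %.6f s" % sys.getswitchinterval())
    print("alone:     %.6f s/call" % mean_call_time(with_cpu_thread=False))
    print("contended: %.6f s/call" % mean_call_time(with_cpu_thread=True))
```

If the description above holds, the contended figure should sit near the
switching interval rather than near the requested sleep time.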

Antoine's interactiveness patch resolves this by rewarding threads that
regularly release the GIL early: the interpreter quickly recognises them
as doing so and they are subsequently given priority over the threads
that only release the GIL when the check interval expires. When threads
that are flagged as interactive ask for the GIL back, the interpreter
gives it to them immediately instead of forcing them to wait until the
check interval expires (because it trusts those threads not to hang onto
the GIL for too long).
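
To put numbers on the difference, here is a toy arithmetic model, not
the patch itself. It assumes a 5 ms interval, a competing CPU bound
thread that always holds the GIL for the full interval, and a
hypothetical parameter calls_until_flagged for how many early releases
the interpreter needs to see before it flags a thread as interactive.

```python
INTERVAL_US = 5000  # assumed interval of 5 ms, in microseconds


def reacquire_delay_us(call_us, interactive, interval_us=INTERVAL_US):
    """Time from releasing the GIL until holding it again (toy model)."""
    if interactive:
        # flagged threads get the GIL back as soon as they ask for it
        return call_us
    # otherwise the CPU bound thread keeps it until the interval expires
    return max(call_us, interval_us)


def total_us(calls, call_us, calls_until_flagged):
    """Total time for a run of short calls made by one thread."""
    total = 0
    for i in range(calls):
        flagged = i >= calls_until_flagged
        total += reacquire_delay_us(call_us, flagged)
    return total


# never flagged: every 100 us call costs a full interval
print(total_us(1000, 100, calls_until_flagged=1000))  # 5000000
# flagged after 3 early releases: only the first 3 calls pay the interval
print(total_us(1000, 100, calls_until_flagged=3))     # 114700
```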

I like this significantly better than the explicit priority patch,
because it means threads are given priority based on their behaviour
(i.e. frequently holding the GIL for less than the check interval)
rather than relying on developers to correctly decide between using a
normal GIL request and a priority request.

Handling of thread pools that mix I/O bound and CPU bound GIL-holding
tasks isn't quite optimal with this approach (since the nature of any
given thread will change over time), but it isn't terrible either - the
interpreter doesn't need much time to reclassify a thread as interactive
or non-interactive (and there are a couple of parameters that can be
tuned to control how quickly these changes in status can happen).
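
The reclassification behaviour can be pictured with a small sketch. This
is not how the patch is implemented; the class name ToyClassifier and
the two knobs promote_after and demote_after are hypothetical stand-ins
for the tunable parameters mentioned above. A thread changes status only
after a run of consecutive observations that contradict its current
status.

```python
class ToyClassifier:
    """Hypothetical per-thread interactive/non-interactive flag."""

    def __init__(self, promote_after=2, demote_after=2):
        self.promote_after = promote_after  # early releases needed to gain the flag
        self.demote_after = demote_after    # full-interval holds needed to lose it
        self.interactive = False
        self._streak = 0  # consecutive observations against current status

    def observe(self, released_early):
        """Record one GIL hold and return the thread's updated status."""
        if released_early != self.interactive:
            self._streak += 1
            if released_early:
                threshold = self.promote_after
            else:
                threshold = self.demote_after
            if self._streak >= threshold:
                self.interactive = released_early
                self._streak = 0
        else:
            # behaviour matches the current status, so reset the streak
            self._streak = 0
        return self.interactive


# a pool thread that runs four I/O bound tasks, then four CPU bound ones
behaviour = [True] * 4 + [False] * 4
clf = ToyClassifier(promote_after=2, demote_after=2)
print([clf.observe(b) for b in behaviour])
# [False, True, True, True, True, False, False, False]
```

Smaller thresholds make the status follow a mixed workload more closely,
at the cost of being easier to fool with a single unusual hold.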

Regards,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------