[Python-3000] Kill GIL?

Ivan Krstić krstic at solarsail.hcs.harvard.edu
Mon Sep 18 07:55:59 CEST 2006


Andre Meyer wrote:
> As a heavy user of multi-threading in Python and following the current
> discussions about Python on multi-processor systems on the python-list I
> wonder what the plans are for improving MP performance in Py3k. 

I have four aborted e-mails in my 'Drafts' folder that are asking the
same question; each time, I decided that the almost inevitably ensuing
"threads suck!" flamewar just isn't worth it. Now that someone else has
taken the plunge...

At present, the Python approach to multi-processing sounds a bit like
"let's stick our collective heads in the sand and pretend there's no
problem". In particular, one oft-parroted argument says that it's not
worth changing or optimizing the language for the few people who can
afford SMP hardware. In the meantime, dual-core laptops are becoming the
standard, with Intel predicting quad-core will become mainstream in the
next few years, and the number of server orders for single-core, UP
machines is plummeting.

From this, it's obvious to me that we need to do *something* to
introduce stronger multi-processing support. Our current abilities are
rather bad: we offer no microthreads, which is making elegant
concurrency primitives such as Erlang's, ported to Python by the
Candygram project [0], unnecessarily expensive. Instead, we only offer
heavy threads that each allocate a full-size stack, and there's no
actual ability to parallelize thread execution across CPUs. There's also
no way to simply fork and coordinate between the forked processes when
the problem being solved calls for it, since there's no
shared memory primitive in the stdlib (this because shared memory
semantics are notoriously different across platforms). On top of it all,
any adopted solution needs to be implementable across all the major
Python interpreters, which makes finding a solution that much harder.
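
For illustration, here is roughly what I mean by microthreads with
message passing: a minimal sketch that uses plain generators as
cooperatively scheduled tasks. The names run, producer and consumer are
invented for this example, and it makes no claim about how Candygram or
Erlang actually implement the idea.

```python
from collections import deque


def run(tasks):
    """Round-robin scheduler: each task is a generator that yields to give up control."""
    ready = deque(tasks)
    while ready:
        task = ready.popleft()
        try:
            next(task)          # run the task until its next yield
        except StopIteration:
            continue            # task finished; do not reschedule it
        ready.append(task)


def producer(mailbox, n):
    """Hypothetical task: send n messages, yielding after each one."""
    for i in range(n):
        mailbox.append(i)       # "send": pass a message, share nothing else
        yield


def consumer(mailbox, received, n):
    """Hypothetical task: collect n messages from the mailbox."""
    while len(received) < n:
        if mailbox:
            received.append(mailbox.popleft())
        yield


mailbox = deque()
received = []
run([producer(mailbox, 3), consumer(mailbox, received, 3)])
```

Each task here costs one generator frame rather than a full thread
stack, but everything still runs on a single CPU.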

The way I see it, we have several options:

* Bite the bullet; write and support a stdlib SHM primitive that works
wherever possible, and simply doesn't work on completely broken
platforms (I understand Windows falls into this category). Utilize it in
a lightweight fork-and-coordinate wrapper provided in the stdlib.

* Bite the mortar shell, and remove the GIL.

* Introduce microthreads, declare that Python endorses Erlang's
no-sharing approach to concurrency, and incorporate something like
candygram into the stdlib.

* Introduce a fork-and-coordinate wrapper in the stdlib, and declare
that we're simply not going to support the use case that requires
sharing (as opposed to merely passing) objects between processes.
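
To make the last option concrete, here is a minimal sketch of what such
a wrapper might look like. It assumes a POSIX platform with os.fork();
fork_map is a made-up name, not an existing stdlib function. Results are
passed back to the parent as pickles over a pipe, so nothing is shared
after the fork.

```python
import os
import pickle


def fork_map(func, items):
    """Run func(item) in one forked child per item; return results in order.

    Hypothetical helper, POSIX only: objects are passed, never shared.
    """
    children = []
    for item in items:
        read_fd, write_fd = os.pipe()
        pid = os.fork()
        if pid == 0:
            # Child: compute, send the pickled result, exit immediately.
            os.close(read_fd)
            status = 0
            try:
                with os.fdopen(write_fd, "wb") as out:
                    pickle.dump(func(item), out)
            except BaseException:
                status = 1
            finally:
                os._exit(status)
        # Parent: keep only the read end of this child's pipe.
        os.close(write_fd)
        children.append((pid, read_fd))

    results = []
    for pid, read_fd in children:
        with os.fdopen(read_fd, "rb") as src:
            payload = src.read()        # read until the child closes its end
        _, status = os.waitpid(pid, 0)  # reap the child
        if status != 0:
            raise RuntimeError("child %d failed" % pid)
        results.append(pickle.loads(payload))
    return results


squares = fork_map(lambda x: x * x, [1, 2, 3])
```

The children run as separate processes, so the work can be spread across
CPUs, but any result has to be picklable, which is exactly the "passing,
not sharing" restriction.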

The first option is a Pareto optimization, but having stdlib
functionality flat out unavailable on some platforms might be out of the
question. It'd be good to hear Guido's longer-term view on concurrency
in Python. That discussion might be more appropriate on python-dev, though.

Cheers,


[0] http://candygram.sourceforge.net/

-- 
Ivan Krstić <krstic at solarsail.hcs.harvard.edu> | GPG: 0x147C722D