[Python-Dev] Fwd: Removal of GIL through refcounting removal.

Josiah Carlson josiah.carlson at gmail.com
Tue Nov 4 03:11:43 CET 2008

On Mon, Nov 3, 2008 at 10:59 AM, Curt Hagenlocher <curt at hagenlocher.org> wrote:

> On Mon, Nov 3, 2008 at 11:50 AM, Josiah Carlson <josiah.carlson at gmail.com> wrote:
>> On Sun, Nov 2, 2008 at 3:51 PM, <skip at pobox.com> wrote:
>>>    Antoine> I think it is important to remind that the GIL doesn't prevent
>>>    Antoine> (almost) true multithreading. The only thing it prevents is
>>>    Antoine> full use of multi-CPU resources in a single process.
>>>
>>> I believe everyone here knows that.  I believe what most people are
>>> clamoring for is to make "full use of their multi-CPU resources in a
>>> single process".
>> Which is, arguably, silly.  As we've seen in the last 2 months with
>> Chrome, multiple processes for a single "program" is actually a pretty good
>> idea.  With the multiprocessing module in the standard library offering a
>> threading-like interface, people no longer have any excuses for not fully
>> exploiting their multiple cores in Python.
> There is no shortage of algorithms (such as matrix multiplication) that are
> parallelizable but not particularly good candidates for an IPC-based
> multiprocessing paradigm.

Ahh, but those algorithms aren't going to be written in Python; they are
going to be written in C, and are going to manipulate memory directly.  With
such things, you can use standard Python threads, call into your C runtime,
and release the GIL.  Alternatively, you can use the mmap module to store
your data, shared across multiple processes, using the same direct-memory
access as you would with multiple threads and GIL release.
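
To make the mmap suggestion concrete, here is a minimal sketch, assuming a
file-backed mapping and a child interpreter started with subprocess.  The
names share_via_mmap, CHILD_CODE and SIZE are made up for illustration and
are not from this thread; a real program would also need some form of
synchronization between the processes, which is left out here.

```python
import mmap
import os
import struct
import subprocess
import sys
import tempfile

SIZE = 4096  # size of the shared region in bytes (arbitrary for this sketch)

# Source for the child process: map the same file and write an integer
# directly into the shared bytes, with no pickling or pipe traffic.
CHILD_CODE = """
import mmap, struct, sys
with open(sys.argv[1], "r+b") as f:
    with mmap.mmap(f.fileno(), 0) as m:
        struct.pack_into("<q", m, 0, 12345)
        m.flush()
"""


def share_via_mmap():
    """Map a temp file, let a child process write into it, read it back."""
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, b"\0" * SIZE)  # give the file its full length first
        with mmap.mmap(fd, SIZE) as m:
            # The child maps the same file, so both processes see one buffer.
            subprocess.run([sys.executable, "-c", CHILD_CODE, path], check=True)
            value = struct.unpack_from("<q", m, 0)[0]
        return value
    finally:
        os.close(fd)
        os.remove(path)


if __name__ == "__main__":
    print(share_via_mmap())
```

The parent never receives a message from the child; it simply reads the
bytes the child stored in the mapping.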

Also, most local-only communications primitives (named pipes, anonymous
pipes, unix domain sockets, ...) use zero/one copy implementations, so as
long as your RPC isn't slow, you can do pretty well even on the Python side
(especially if you pre-allocate your receive buffer, and fill in the data as
you receive it; this is pretty much what mutablebytes was created for, now
we just need a proper memoryview).
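
The pre-allocated receive buffer idea can be sketched roughly as follows.
This assumes that bytearray is the type referred to above as "mutablebytes",
and it uses socket.socketpair() as a stand-in for a local-only transport.
The helper names recv_exactly and demo_roundtrip are hypothetical.

```python
import socket


def recv_exactly(sock, nbytes):
    """Receive exactly nbytes into one pre-allocated buffer."""
    buf = bytearray(nbytes)   # allocated once, up front
    view = memoryview(buf)    # slicing a memoryview does not copy
    received = 0
    while received < nbytes:
        # recv_into writes straight into the unfilled tail of the buffer.
        n = sock.recv_into(view[received:])
        if n == 0:
            raise EOFError("peer closed the connection early")
        received += n
    return buf


def demo_roundtrip():
    """Send a small payload over a local socket pair and compare."""
    a, b = socket.socketpair()
    try:
        payload = bytes(range(256)) * 8  # 2048 bytes, fits the socket buffer
        a.sendall(payload)
        return bytes(recv_exactly(b, len(payload))) == payload
    finally:
        a.close()
        b.close()


if __name__ == "__main__":
    print(demo_roundtrip())
```

Compared with concatenating the results of repeated recv() calls, this
avoids building a new bytes object for every chunk that arrives.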

So again, I claim that it is silly not to use multiple processes if you
want to use the cores of your multi-core machine to their full extent.

As an aside, Python array.array() instances have a char* and length, so if
you are careful, you can create an array.array object from an mmap pointer,
and get fairly decent performance even in Python with shared memory.
Someone should probably look into allowing array.array() to take a
read/readwrite memoryview as an argument to support such things, as well as
allowing mmaps to be passed via multiprocessing (if they aren't already).
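
For typed, array-like access to mapped memory from pure Python, one possible
approach is to cast a memoryview of the mmap.  Note that this is a different
technique from the array.array idea above: it swaps in memoryview.cast()
(available in later Python 3 releases) and uses an anonymous mapping for
brevity.  The function name fill_squares is hypothetical.

```python
import mmap
import struct

ITEMSIZE = struct.calcsize("i")  # size of a native C int on this platform


def fill_squares(count):
    """Write count squares into mapped memory through a typed view."""
    with mmap.mmap(-1, count * ITEMSIZE) as m:  # -1 requests anonymous memory
        # Both views must be released before the mapping can be closed,
        # which the nested with-statement takes care of.
        with memoryview(m) as raw, raw.cast("i") as ints:
            for i in range(count):
                ints[i] = i * i  # stored as native ints in the mapping
            result = ints.tolist()
    return result


if __name__ == "__main__":
    print(fill_squares(5))
```

With a file-backed mapping in place of the anonymous one, the same typed
view could be opened over memory that another process has also mapped.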

 - Josiah