[Python-Dev] Removing the GIL (Me, not you!)
tulloss2 at uiuc.edu
Thu Sep 13 09:08:35 CEST 2007
> What do you think?
I'm going to have to agree with Martin here, although I'm not sure I
understand what you're saying entirely. Perhaps if you explained where the
benefits of this approach come from, it would clear up what you're thinking.
After a few days of thought, I'm starting to realize the difficulty of
maintaining compatibility with existing C extensions after removing the GIL.
The possible C-level side effects are very difficult to work around without
kernel or hardware level transaction support. I see a couple of approaches
that might work (though I probably haven't thought of everything).
1. Use message passing and transactions.
Put every module into its own tasklet that ends up getting owned by one
thread or another. Every call to an object that is owned by that module is
put into a module-wide message queue and delivered sequentially to its
objects. All this does is serialize requests to objects implemented in C to
slightly mitigate the need to lock. Then use transactions to protect any
Python object. You still have the problem of C side effects going unnoticed
(i.e. Thread A executes a function, the object sets C state in a certain way,
Thread B calls the same function and changes all the C state, then A reacts
to a return value that no longer reflects the actual state). So, this doesn't
actually work, but it's close, since Python objects will remain consistent
with transactions and conflicting C code won't execute simultaneously.
2. Do it Perl style.
Perl just spawns off multiple interpreters and doesn't share state between
them. This would require cleaning up what state belongs where, and probably
making some global state available lock-free. For instance, all the numbers,
letters, and None are read-only, so we could probably work out a way to
share them between threads. In fact, any Python global could be read-only
until it is written to. Then it belongs to the thread that wrote to it and
is updated in the other threads via some sort of cache-coherency protocol. I
haven't really wrapped my head around how C extensions would play with this
yet, but essentially code operating in different threads would be operating
on different copies of the modules. That seems fair to me.
3. Come up with an elegant way of handling multiple Python processes. Of
course, this has some downsides. I don't really want to pickle Python
objects around if I decide they need to be in another address space, which I
would probably occasionally need to do if I abstracted away the fact that a
bunch of interpreters had been spawned off.
4. Remove the GIL, use transactions for Python objects, and adapt all
C extensions to be thread safe. Woo.
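To make "transactions for Python objects" a bit more concrete, here is a minimal sketch of optimistic, versioned updates in pure Python. The names TVar and atomically are hypothetical and chosen only for illustration; a real implementation would live inside the interpreter, not in a library like this. It assumes the update function has no side effects, which is exactly the property C extensions cannot promise.

```python
import threading


class TVar:
    """A value plus a version counter (hypothetical name, for illustration)."""

    def __init__(self, value):
        self.value = value
        self.version = 0
        self._lock = threading.Lock()


def atomically(tvar, update):
    """Apply update(old) -> new, retrying whenever another thread commits first."""
    while True:
        # Take a consistent snapshot of the value and its version.
        with tvar._lock:
            seen_value = tvar.value
            seen_version = tvar.version

        # Compute the new value outside the lock; this may be thrown away.
        new_value = update(seen_value)

        # Commit only if nobody else committed since the snapshot.
        with tvar._lock:
            if tvar.version == seen_version:
                tvar.value = new_value
                tvar.version += 1
                return new_value
        # Conflict detected: loop and recompute from a fresh snapshot.
```

Because a conflicting transaction is simply recomputed, update may run more than once per call. That is harmless for pure Python computation, but any C-level side effect inside it would be repeated, which is why this approach still needs the extensions themselves to cooperate.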
I'll keep kicking around ideas for a while; hopefully they'll become more
refined as I explore the code more.
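As a rough illustration of the module-wide message queue in approach 1, the sketch below funnels every call through a single owner thread. ModuleActor and its call method are hypothetical names, and plain Python threads stand in for tasklets here; this is a sketch under those assumptions, not a proposal for the actual mechanism.

```python
import queue
import threading


class ModuleActor:
    """Owns one thread and runs every submitted call on it, one at a time."""

    def __init__(self):
        self._inbox = queue.Queue()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def _run(self):
        # Deliver messages sequentially, in arrival order.
        while True:
            func, args, reply = self._inbox.get()
            try:
                reply.put((True, func(*args)))
            except Exception as exc:
                # Hand the exception back to the caller instead of dying.
                reply.put((False, exc))

    def call(self, func, *args):
        """Enqueue a call and block until the owner thread has run it."""
        reply = queue.Queue(maxsize=1)
        self._inbox.put((func, args, reply))
        ok, result = reply.get()
        if not ok:
            raise result
        return result
```

Calls from any number of threads are serialized, so two of them never run inside the module at once. It does not address the stale-return-value problem described in approach 1: thread B's call can still change the module's state between A's call returning and A acting on the result.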
PS. A good paper on how hardware transactional memory could help us out:
A few of you have probably read this already. Martin is even acknowledged,
but it was news to me!