
On 21 June 2015 at 15:25, Nathaniel Smith <njs@pobox.com> wrote:
On Jun 20, 2015 3:54 PM, "Eric Snow" <ericsnowcurrently@gmail.com> wrote:
On Jun 20, 2015 4:08 PM, "Nathaniel Smith" <njs@pobox.com> wrote:
On Jun 20, 2015 2:42 PM, "Eric Snow" <ericsnowcurrently@gmail.com> wrote:
tl;dr Let's exploit multiple cores by fixing up subinterpreters, exposing them in Python, and adding a mechanism to safely share objects between them.
This all sounds really cool if you can pull it off, and shared-nothing threads do seem like the least impossible model to pull off.
Agreed.
But "least impossible" and "possible" are different :-). From your email I can't tell whether this plan is viable while preserving backcompat and memory safety.
I agree that those issues must be clearly solved in the proposal before it can be approved. I'm confident the approach I'm pursuing will afford us the necessary guarantees. I'll address those specific points directly when I can sit down and organize my thoughts.
I'd love to see just a hand wavy, verbal proof-of-concept walking through how this might work in some simple but realistic case. To me a single compelling example could make this proposal feel much more concrete and achievable.
I was one of the folks pushing Eric in this direction, and that's because it's a possibility that was conceived of a few years back, but never tried due to lack of time (and inclination for those of us that are using Python primarily as an orchestration tool and hence spend most of our time on IO bound problems rather than CPU bound ones):
http://www.curiousefficiency.org/posts/2012/07/volunteer-supported-free-thre...

As mentioned there, I've at least spent some time with Graham Dumpleton over the past few years figuring out (and occasionally trying to address) some of the limitations of mod_wsgi's existing subinterpreter-based WSGI app separation:
https://code.google.com/p/modwsgi/wiki/ProcessesAndThreading#Python_Sub_Inte...

The fact that mod_wsgi can run most Python web applications in a subinterpreter quite happily means we already know the core mechanism works fine, and there don't appear to be any insurmountable technical hurdles between the status quo and getting to a point where we can either switch the GIL to a read/write lock where a write lock is only needed for inter-interpreter communications, or else find a way for subinterpreters to release the GIL entirely by restricting them appropriately.

For inter-interpreter communication, the worst case scenario is having to rely on a memcpy-based message passing system (which would still be faster than multiprocessing's serialisation + IPC overhead), but there don't appear to be any insurmountable barriers to setting up an object-ownership-based system instead (code that accesses PyObject_HEAD fields directly rather than through the relevant macros and functions seems to be the most likely culprit for breaking, but I think "don't do that" is a reasonable answer there).

There's plenty of prior art here (including a system I once wrote in C myself atop TI's DSP/BIOS MBX and TSK APIs), so I'm comfortable with Eric's "simple matter of engineering" characterisation of the problem space.
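
As a rough illustration of the worst-case, copy-based message passing described above, here is a minimal sketch. It is not a subinterpreter implementation: ordinary threads stand in for subinterpreters, and pickle stands in for the memcpy-style copy, so it shows only the shared-nothing semantics, not the performance. The names CopyChannel, worker and run_demo are hypothetical and exist only for this example.

```python
import pickle
import queue
import threading


class CopyChannel:
    """One-way channel that gives the receiver a private copy of each message.

    Hypothetical helper: pickle is a stand-in for whatever copy mechanism
    a real inter-interpreter channel would use.
    """

    def __init__(self):
        self._queue = queue.Queue()

    def send(self, obj):
        # Serialise on the sending side so no object reference crosses over.
        self._queue.put(pickle.dumps(obj))

    def recv(self):
        # Rebuild a fresh object graph owned solely by the receiver.
        return pickle.loads(self._queue.get())


def worker(inbox, outbox):
    # Stand-in for code that would run in a separate subinterpreter.
    while True:
        message = inbox.recv()
        if message is None:  # sentinel: no more work
            break
        outbox.send(sum(message))


def run_demo():
    inbox, outbox = CopyChannel(), CopyChannel()
    thread = threading.Thread(target=worker, args=(inbox, outbox))
    thread.start()

    payload = [1, 2, 3]
    inbox.send(payload)
    payload.append(100)  # mutating after send does not affect the copy sent
    result = outbox.recv()

    inbox.send(None)  # ask the worker to stop
    thread.join()
    return result
```

Because the payload is copied at send time, the worker sums the list as it was when sent, regardless of what the sender does to its own copy afterwards. An ownership-based system would instead hand the object itself over and forbid the sender from touching it again.
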
The main reason that subinterpreters have never had a Python API before is that they have enough rough edges that having to write a custom C extension module to access the API is the least of your problems if you decide you need them. At the same time, not having a Python API not only makes them much harder to test, which means various aspects of their operation are more likely to be broken, but also makes them inherently CPython specific.

Eric's proposal essentially amounts to three things:

1. Filing off enough of the rough edges of the subinterpreter support that we're comfortable giving them a public Python level API that other interpreter implementations can reasonably support
2. Providing the primitives needed for safe and efficient message passing between subinterpreters
3. Allowing subinterpreters to truly execute in parallel on multicore machines

All 3 of those are useful enhancements in their own right, which offers the prospect of being able to make incremental progress towards the ultimate goal of native Python level support for distributing across multiple cores within a single process.

Cheers,
Nick.

-- 
Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia