[Python-ideas] solving multi-core Python

Andrew Barnert abarnert at yahoo.com
Sun Jun 21 23:24:19 CEST 2015

On Jun 21, 2015, at 06:09, Nick Coghlan <ncoghlan at gmail.com> wrote:
> Avoiding object serialisation is indeed the main objective. With
> subinterpreters, we have a lot more options for that than we do with
> any form of IPC, including shared references to immutable objects, and
> the PEP 3118 buffer API.

It seems like you could provide a way to efficiently copy and share deeper objects than integers and buffers without sharing everything, assuming the user code knows, at the time those objects are created, that they will be copied or shared. Basically, you allocate the objects into a separate arena (along with allocating their refcounts on a separate page, as already mentioned). You can't add a reference to an outside object in an arena-allocated object, although you can copy that outside object into the arena.

And then you just pass or clone entire arenas instead of individual objects (possibly by using CoW memory-mapping calls, only falling back to memcpy on platforms that can't do that). So you don't need the fictitious memdeepcpy function that someone ridiculed earlier in this thread, but you get 90% of the benefits of having one.
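To make the semantics concrete, here's a minimal pure-Python sketch of what such an arena might look like. Everything here is hypothetical (the `Arena` class, `add`/`get`/`clone` names are mine, not a proposed API), and `copy.deepcopy` stands in for the real copy-into-arena and CoW-mapping machinery:

```python
import copy

class Arena:
    """Hypothetical arena: objects are deep-copied in at creation
    time, so the arena never holds references to outside objects."""

    def __init__(self):
        self._objects = []

    def add(self, obj):
        # Copy the outside object into the arena rather than
        # referencing it -- the arena only ever owns its own storage.
        self._objects.append(copy.deepcopy(obj))
        return len(self._objects) - 1  # handle into the arena

    def get(self, handle):
        return self._objects[handle]

    def clone(self):
        # Stand-in for CoW-mapping (or memcpy-ing) the whole arena:
        # one clone copies every object the arena owns at once.
        new = Arena()
        new._objects = copy.deepcopy(self._objects)
        return new

a = Arena()
h = a.add({"config": [1, 2, 3]})
b = a.clone()
assert a.get(h) == b.get(h)          # same value in both arenas
assert a.get(h) is not b.get(h)      # but distinct storage, no shared refs
```

The point of the `clone`-the-whole-arena design is that the cost of copying is paid once per arena handoff, not once per object reference followed.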

This has the same basic advantages as forking, but it's doable efficiently on Windows, and doable less efficiently (but still better than spawn-and-pass) even on weird embedded platforms, and it forces code to be explicit about what gets shared and copied without forcing it to work through less-natural queue-like APIs.

Also, it seems like you could fake this entire arena API on top of pickle/copy for a first implementation, then just replace the underlying implementation separately.
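As an illustration of that first-cut approach, here's a hedged sketch of a pickle-backed fake. The `PickleArena` class and its `export`/`from_blob` names are invented for this example; the only claim is that "sharing" an arena can initially just mean handing one bytes blob to the other interpreter:

```python
import pickle

class PickleArena:
    """Hypothetical first-cut arena faked on top of pickle: the real
    arena allocator could later replace this without changing the API."""

    def __init__(self):
        self._objects = []

    def add(self, obj):
        # Round-trip through pickle so the arena owns no outside
        # references, mirroring the copy-into-arena rule.
        self._objects.append(pickle.loads(pickle.dumps(obj)))
        return len(self._objects) - 1  # handle into the arena

    def get(self, handle):
        return self._objects[handle]

    def export(self):
        # One serialisation for the whole arena, instead of
        # per-object pickling at each handoff.
        return pickle.dumps(self._objects, protocol=pickle.HIGHEST_PROTOCOL)

    @classmethod
    def from_blob(cls, blob):
        arena = cls()
        arena._objects = pickle.loads(blob)
        return arena

sender = PickleArena()
h = sender.add([1, 2, {"k": "v"}])
blob = sender.export()               # this is what crosses interpreters
receiver = PickleArena.from_blob(blob)
assert receiver.get(h) == [1, 2, {"k": "v"}]
```

Because user code only ever sees handles and arena objects, swapping the pickle blob for a CoW-mapped memory region later wouldn't change any calling code.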
