[Python-ideas] solving multi-core Python

Tue Jun 23 20:29:40 CEST 2015

On Sun, Jun 21, 2015 at 12:40:43PM +0200, Stefan Behnel wrote:
> Nick Coghlan wrote on 21.06.2015 at 12:25:
> > On 21 June 2015 at 19:48, Antoine Pitrou wrote:
> >> On Sun, 21 Jun 2015 16:31:33 +1000 Nick Coghlan wrote:
> >>>
> >>> For inter-interpreter communication, the worst case scenario is having
> >>> to rely on a memcpy based message passing system (which would still be
> >>> faster than multiprocessing's serialisation + IPC overhead)
> >>
> >> And memcpy() updates pointer references to dependent objects magically?
> >> Surely you meant the memdeepcopy() function that's part of every
> >> standard C library!
> > 
> > We already have the tools to do deep copies of object trees (although
> > I'll concede I *was* actually thinking in terms of the classic C/C++
> > mistake of carelessly copying pointers around when I wrote that
> > particular message). One of the options for deep copies tends to be a
> > pickle/unpickle round trip, which will still incur the serialisation
> > overhead, but not the IPC overhead.
> > 
> > "Faster message passing than multiprocessing" sets the baseline pretty
> > low, after all.
> > 
> > However, this is also why Eric mentions the notions of object
> > ownership or limiting channels to less than the full complement of
> > Python objects. As an *added* feature at the Python level, it's
> > possible to initially enforce restrictions that don't exist in the C
> > level subinterpreter API, and then work to relax those restrictions
> > over time.
> 
> If objects can make it explicit that they support sharing (and preferably
> are allowed to implement the exact details themselves), I'm sure we'll find
> ways to share NumPy arrays across subinterpreters. That feature alone tends
> to be a quick way to make a lot of people happy.
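
    (For reference, the pickle/unpickle round trip Nick mentions in the
    quoted text can be sketched in a few lines.  This is only an
    illustration of the in-process copy: copy_via_pickle is a made-up
    helper name, and nothing here involves subinterpreters or any
    channel API.)

```python
import pickle


def copy_via_pickle(obj):
    """Deep-copy obj by serialising and deserialising it in-process.

    Hypothetical helper: it pays the serialisation cost but never
    touches a pipe or socket, so there is no IPC overhead.
    """
    return pickle.loads(pickle.dumps(obj, protocol=pickle.HIGHEST_PROTOCOL))


message = {"id": 7, "payload": [1, 2, [3, 4]]}
received = copy_via_pickle(message)

# Mutating the copy leaves the original untouched, because no object in
# the tree is shared between the two.
received["payload"][2].append(5)
```

    The copy only works for objects pickle knows how to handle, which
    is one reason restricting what may cross a channel is attractive.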

    FWIW, the following commit was all it took to get NumPy playing
    nicely with PyParallel:

        https://github.com/pyparallel/numpy/commit/046311ac1d66cec789fa8fd79b1b582a3dea26a8

    It uses thread-local buckets instead of static ones, and calls out
    to PyMem_Raw(Malloc|Realloc|Calloc|Free) instead of the normal libc
    counterparts.  This means PyParallel will intercept the call within
    a parallel context and divert it to the per-context heap.

    Example parallel callback using NumPy:

        https://bitbucket.org/tpn/pyparallel/src/8528b11ba51003a9821ceb75683ee96ed33db28a/examples/wiki/wiki.py?at=3.3-px#cl-285

    (Also, datrie is a Cython module, and that seems to work fine as
     well, which is neat, as it means you could sub out the entire
     Python callback with a Cythonized version, including all the
     relatively-slow-compared-to-C http header parsing that happens in
     async.http.server.)

        Trent.