
On 24.06.2015 18:58, Sturla Molden wrote:
On 24/06/15 13:43, M.-A. Lemburg wrote:
That said, I still think the multiple-process approach is the better one (more robust, more compatible, fewer problems). We'd just need a far more efficient way of sharing objects between the Python processes than pickling them across shared memory or pipes :-)
It is hard to get around shared memory, Unix domain sockets, or pipes. There must be some sort of IPC, regardless.
Sure, but the current approach of pickling Python objects for communication adds too much overhead in many cases. It also multiplies the memory requirements of the multiple-process approach, since you eventually end up with n copies of the same data in memory (with n = number of parallel workers).
One idea I have played with is to use a specialized queue instead of the current multiprocessing.Queue. In scientific computing we often need to pass arrays, so it would make sense to have a queue that could bypass pickle for NumPy arrays, scalars and dtypes, simply by using the NumPy C API to process the data. It could also have specialized code for a number of other objects -- at least str, int, float, complex, and PEP 3118 buffers, and perhaps also simple lists, tuples and dicts of these types. I think such a queue could avoid the pickle issue for 99% of scientific computing. It would be very easy to write with Cython, e.g. as a part of NumPy or SciPy.
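[Editor's note: the pickle-bypassing channel described above can be sketched in pure Python, without the C API. The header (dtype, shape) still goes through the normal pickling path, but the array payload is sent as raw bytes via Connection.send_bytes(). The class name ArrayChannel is a hypothetical illustration, not an existing API.]

```python
import numpy as np
from multiprocessing import Pipe


class ArrayChannel:
    """Minimal sketch of a pickle-free channel for NumPy arrays.

    A tiny (dtype, shape) header is sent normally; the array buffer
    itself crosses the pipe as raw bytes, never touching pickle.
    """

    def __init__(self):
        # Pipe(duplex=False) returns (receive-only, send-only) ends.
        self._recv_end, self._send_end = Pipe(duplex=False)

    def put(self, arr):
        arr = np.ascontiguousarray(arr)
        # Header: dtype string and shape -- tiny, so pickling it is cheap.
        self._send_end.send((arr.dtype.str, arr.shape))
        # Payload: the raw buffer, sent without serialization.
        self._send_end.send_bytes(arr.tobytes())

    def get(self):
        dtype_str, shape = self._recv_end.recv()
        data = self._recv_end.recv_bytes()
        return np.frombuffer(data, dtype=np.dtype(dtype_str)).reshape(shape)
```

A real implementation would dispatch on type (scalars, str, buffers, containers of these) and fall back to pickle only for everything else.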
The tricky part is managing pointers in those data structures, e.g. a container types for other Python objects will have to store all referenced objects in the shared memory segment as well. For NumPy arrays using simple types this is a lot easier, since you don't have to deal with pointers to other objects.
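[Editor's note: the pointer problem above can be made concrete with NumPy's dtype.hasobject flag. A fixed-size numeric buffer is self-contained, while an object-dtype buffer stores PyObject pointers that are meaningless in another process.]

```python
import numpy as np

# A fixed-size numeric dtype means the buffer is self-contained:
# just raw numbers, no pointers into the owning process's heap.
flat = np.arange(4, dtype=np.float64)

# An object dtype stores PyObject pointers in the buffer; those
# pointers are meaningless in another process, so sharing the raw
# buffer cannot work without also moving every referenced object
# into the shared segment.
boxed = np.array([1, "two", 3.0], dtype=object)
```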
One thing I did some years ago was to implement NumPy arrays that store their data in shared memory. When passed to multiprocessing.Queue they would pickle only the metadata, not the data buffer. However, this did not improve performance: the pickle overhead was still there, and passing the binary data over a pipe had not been the expensive part to begin with. So while it saved memory, it did not make programs using multiprocessing and NumPy any faster.
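[Editor's note: the scheme described above -- a shared-memory buffer where only metadata crosses the queue -- can be sketched with the stdlib multiprocessing.shared_memory module (added later, in Python 3.8; at the time of this thread it required third-party code). The helper names are hypothetical.]

```python
import numpy as np
from multiprocessing import shared_memory


def shared_array(shape, dtype=np.float64):
    """Allocate an ndarray backed by named shared memory.

    Only the (name, shape, dtype) triple needs to cross a queue;
    the buffer itself is never copied through a pipe.
    """
    size = int(np.prod(shape)) * np.dtype(dtype).itemsize
    shm = shared_memory.SharedMemory(create=True, size=size)
    arr = np.ndarray(shape, dtype=dtype, buffer=shm.buf)
    return arr, shm


def attach_array(name, shape, dtype):
    """Reconstruct the array in another process from metadata alone."""
    shm = shared_memory.SharedMemory(name=name)
    return np.ndarray(shape, dtype=dtype, buffer=shm.buf), shm
```

As the paragraph above notes, this saves memory and pipe traffic but not the pickle overhead on the metadata, which is why it did not translate into a speedup.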
When you say "passing a lot of binary data over a pipe", do you mean the meta-data?

I had discussed the idea of Python object sharing with Larry Hastings back in 2013, but decided that trying to keep all references of container objects managed in the shared memory segment would be too fragile an approach to pursue further.

Still, after some more research later that year, I found that someone had already investigated the idea in 2003: http://poshmodule.sourceforge.net/

Reading the paper on this, http://poshmodule.sourceforge.net/posh/posh.pdf, made me wonder why the idea never received more attention in all these years. The results are clearly positive and show that the multiple-process approach, combined with shared-memory object storage, can scale better than threads.

-- Marc-Andre Lemburg
eGenix.com Professional Python Services directly from the Source (#1, Jun 24 2015)