
On 24.06.2015 18:58, Sturla Molden wrote:
On 24/06/15 13:43, M.-A. Lemburg wrote:
That said, I still think the multiple-process approach is the better one (more robust, more compatible, fewer problems). We'd just need a far more efficient way of sharing objects between the Python processes than pickling them across shared memory or pipes :-)
It is hard to get around shared memory, Unix domain sockets, or pipes. There must be some sort of IPC, regardless.
Sure, but the current approach of pickling Python objects for communication adds too much overhead in many cases. It also multiplies the memory requirements of the multiple-process approach, since you eventually end up with n copies of the same data in memory (with n = number of parallel workers).
One idea I have played with is to use a specialized queue instead of the current multiprocessing.Queue. In scientific computing we often need to pass arrays, so it would make sense to have a queue that could bypass pickle for NumPy arrays, scalars and dtypes, simply by using the NumPy C API to process the data. It could also have specialized code for a number of other objects -- at least str, int, float, complex, and PEP 3118 buffers, and perhaps also simple lists, tuples and dicts of these types. I think such a queue could avoid the pickle issue for 99% of scientific computing. It would be very easy to write with Cython, e.g. as a part of NumPy or SciPy.
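[Editor's note: the pickle-bypassing channel described above can be sketched in pure Python, without the C API. The header (dtype, shape) still goes through the normal pickling path, but the array payload is sent as raw bytes via Connection.send_bytes(). The class name ArrayChannel is a hypothetical illustration, not an existing API.]

```python
import numpy as np
from multiprocessing import Pipe


class ArrayChannel:
    """Minimal sketch of a pickle-free channel for NumPy arrays.

    A tiny (dtype, shape) header is sent normally; the array buffer
    itself crosses the pipe as raw bytes, never touching pickle.
    """

    def __init__(self):
        # Pipe(duplex=False) returns (receive-only, send-only) ends.
        self._recv_end, self._send_end = Pipe(duplex=False)

    def put(self, arr):
        arr = np.ascontiguousarray(arr)
        # Header: dtype string and shape -- tiny, so pickling it is cheap.
        self._send_end.send((arr.dtype.str, arr.shape))
        # Payload: the raw buffer, sent without serialization.
        self._send_end.send_bytes(arr.tobytes())

    def get(self):
        dtype_str, shape = self._recv_end.recv()
        data = self._recv_end.recv_bytes()
        return np.frombuffer(data, dtype=np.dtype(dtype_str)).reshape(shape)
```

A real implementation would dispatch on type (scalars, str, buffers, containers of these) and fall back to pickle only for everything else.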
The tricky part is managing pointers in those data structures, e.g. a container types for other Python objects will have to store all referenced objects in the shared memory segment as well. For NumPy arrays using simple types this is a lot easier, since you don't have to deal with pointers to other objects.
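[Editor's note: the pointer problem above can be made concrete with NumPy's dtype.hasobject flag. A fixed-size numeric buffer is self-contained, while an object-dtype buffer stores PyObject pointers that are meaningless in another process.]

```python
import numpy as np

# A fixed-size numeric dtype means the buffer is self-contained:
# just raw numbers, no pointers into the owning process's heap.
flat = np.arange(4, dtype=np.float64)

# An object dtype stores PyObject pointers in the buffer; those
# pointers are meaningless in another process, so sharing the raw
# buffer cannot work without also moving every referenced object
# into the shared segment.
boxed = np.array([1, "two", 3.0], dtype=object)
```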
One thing I did some years ago was to implement NumPy arrays that store their data in shared memory. When passed to multiprocessing.Queue they would pickle only the metadata, not the data buffer. However, this did not improve performance: the pickle overhead was still there, and passing the binary data over a pipe had not been the expensive part to begin with. So while it saved memory, it did not make programs using multiprocessing and NumPy any faster.
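[Editor's note: the scheme described above -- a shared-memory buffer where only metadata crosses the queue -- can be sketched with the stdlib multiprocessing.shared_memory module (added later, in Python 3.8; at the time of this thread it required third-party code). The helper names are hypothetical.]

```python
import numpy as np
from multiprocessing import shared_memory


def shared_array(shape, dtype=np.float64):
    """Allocate an ndarray backed by named shared memory.

    Only the (name, shape, dtype) triple needs to cross a queue;
    the buffer itself is never copied through a pipe.
    """
    size = int(np.prod(shape)) * np.dtype(dtype).itemsize
    shm = shared_memory.SharedMemory(create=True, size=size)
    arr = np.ndarray(shape, dtype=dtype, buffer=shm.buf)
    return arr, shm


def attach_array(name, shape, dtype):
    """Reconstruct the array in another process from metadata alone."""
    shm = shared_memory.SharedMemory(name=name)
    return np.ndarray(shape, dtype=dtype, buffer=shm.buf), shm
```

As the paragraph above notes, this saves memory and pipe traffic but not the pickle overhead on the metadata, which is why it did not translate into a speedup.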
When you say "passing a lot of binary data over a pipe", do you mean the meta-data?

I had discussed the idea of Python object sharing with Larry Hastings back in 2013, but decided that trying to keep all references of container objects managed in the shared memory segment would be too fragile an approach to pursue further.

Still, after some more research later that year, I found that someone had already investigated the idea in 2003: http://poshmodule.sourceforge.net/

Reading the paper on this, http://poshmodule.sourceforge.net/posh/posh.pdf, made me wonder why the idea never received more attention in all these years. The results are clearly positive and show that the multiple-process approach, combined with shared-memory object storage, can scale better than threads.

-- Marc-Andre Lemburg
eGenix.com Professional Python Services directly from the Source (#1, Jun 24 2015)