I actually know very little about multiprocessing (I have never used it), but I imagine the way you normally interact with it is through synchronous calls that talk to the subprocesses and their work queues and so on, right?

In the asyncio world you would put that work in a thread and then use run_in_executor() with a thread executor -- the thread would then be managing the subprocesses and talking to them. While you are waiting for that thread to complete, your other coroutines will still run.
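
Something along these lines, roughly -- an untested sketch, with worker/manage_subprocess being placeholder names for illustration:

import asyncio
import multiprocessing

def worker(q, val, val2):
    # Runs in a child process.
    q.put(val + val2)

def manage_subprocess(val, val2):
    # Ordinary blocking multiprocessing code; it runs in an executor
    # thread, so blocking here doesn't stall the event loop.
    q = multiprocessing.Queue()
    p = multiprocessing.Process(target=worker, args=(q, val, val2))
    p.start()
    result = q.get()  # blocks, but only blocks the executor thread
    p.join()
    return result

@asyncio.coroutine
def main(loop):
    # None selects the loop's default (thread pool) executor.
    result = yield from loop.run_in_executor(None, manage_subprocess, 1, 2)
    print("worker returned", result)

if __name__ == "__main__":
    loop = asyncio.get_event_loop()
    loop.run_until_complete(main(loop))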

Unless you want to rewrite the communication and process management as coroutines, but that sounds like a lot of work.


On Sat, Jul 26, 2014 at 1:59 PM, Dan O'Reilly <oreilldf@gmail.com> wrote:
I think it would be helpful for folks using the asyncio module to be able to make non-blocking calls to objects in the multiprocessing module more easily. While some use cases for multiprocessing can be replaced with ProcessPoolExecutor/run_in_executor, others cannot; more advanced usages of multiprocessing.Pool aren't supported by ProcessPoolExecutor (initializer/initargs, contexts, etc.), and other multiprocessing classes like Lock and Queue have blocking methods that could be made into coroutines.
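
As a quick illustration of the Pool features ProcessPoolExecutor doesn't cover, consider something like this (init_worker and the config dict are just placeholder names):

import multiprocessing

def init_worker(config):
    # Runs once in each worker process before any tasks are dispatched;
    # ProcessPoolExecutor doesn't offer an equivalent hook.
    global worker_config
    worker_config = config

def compute(x):
    return x * worker_config["scale"]

if __name__ == "__main__":
    ctx = multiprocessing.get_context("spawn")  # explicit start-method context
    with ctx.Pool(processes=4, initializer=init_worker,
                  initargs=({"scale": 10},)) as pool:
        print(pool.map(compute, range(5)))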

Consider this (extremely contrived, but use your imagination) example of an asyncio-friendly Queue:

import asyncio
import time
from concurrent.futures import ProcessPoolExecutor

def do_proc_work(q, val, val2):
    time.sleep(3)  # Imagine this is some expensive CPU work.
    ok = val + val2
    print("Passing {} to parent".format(ok))
    q.put(ok) # The Queue can be used with the normal blocking API, too.
    item = q.get() 
    print("got {} back from parent".format(item))

@asyncio.coroutine
def do_some_async_io_task():
    # Imagine there's some kind of asynchronous I/O
    # going on here that utilizes asyncio.
    yield from asyncio.sleep(5)

@asyncio.coroutine
def do_work(q):
    loop.run_in_executor(ProcessPoolExecutor(),
                         do_proc_work, q, 1, 2)
    yield from do_some_async_io_task()
    item = yield from q.coro_get() # Non-blocking get that won't affect our io_task
    print("Got {} from worker".format(item))
    item = item + 25
    yield from q.coro_put(item)


if __name__ == "__main__":
    q = AsyncProcessQueue()  # This is our new asyncio-friendly version of multiprocessing.Queue 
    loop = asyncio.get_event_loop()
    loop.run_until_complete(do_work(q))
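
For a rough idea of what coro_get/coro_put might look like internally, one simplified, untested sketch is to delegate the blocking Queue methods to a thread executor (a real implementation would also have to handle pickling so the queue can cross process boundaries):

import asyncio
import multiprocessing
from concurrent.futures import ThreadPoolExecutor

class AsyncProcessQueue:
    """multiprocessing.Queue wrapper with coroutine-friendly get/put."""

    def __init__(self, maxsize=0):
        self._queue = multiprocessing.Queue(maxsize)
        self._executor = ThreadPoolExecutor(max_workers=1)

    def get(self, *args, **kwargs):
        return self._queue.get(*args, **kwargs)  # normal blocking API

    def put(self, item, *args, **kwargs):
        return self._queue.put(item, *args, **kwargs)

    @asyncio.coroutine
    def coro_get(self):
        # Run the blocking get() in a thread so the event loop stays responsive.
        loop = asyncio.get_event_loop()
        return (yield from loop.run_in_executor(self._executor, self._queue.get))

    @asyncio.coroutine
    def coro_put(self, item):
        loop = asyncio.get_event_loop()
        return (yield from loop.run_in_executor(self._executor, self._queue.put, item))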

I have seen some rumblings about a desire to do this kind of integration on the bug tracker (http://bugs.python.org/issue10037#msg162497 and http://bugs.python.org/issue9248#msg221963) though that discussion is specifically tied to merging the enhancements from the Billiard library into multiprocessing.Pool. Are there still plans to do that? If so, should asyncio integration with multiprocessing be rolled into those plans, or does it make sense to pursue it separately?

Even more generally, do people think this kind of integration is a good idea to begin with? I know using asyncio is primarily about *avoiding* the headaches of concurrent threads/processes, but there are always going to be cases where CPU-intensive work is required in a primarily I/O-bound application. The easier it is for developers to handle those use cases, the better, IMO.

Note that the same sort of integration could be done with the threading module, though I think there's a fairly limited use case for that; in most situations where you'd reach for threads rather than processes, you could probably just use non-blocking I/O instead.

Thanks,
Dan





--
--Guido van Rossum (python.org/~guido)