<div dir="ltr">Hi all<div><br></div><div>I am trying to use asyncio in real applications and it is not going that easily; help from asyncio gurus is badly needed.</div><div><br></div><div>Consider a task like crawling the web starting from a few websites. Each site generates new download tasks in exponential(!) progression. However, we want neither to flood the event loop nor to overload our network. We'd like to control the task flow. I achieve this well with a modification of Maxime's nice solution proposed here:</div>
<div><a href="https://mail.python.org/pipermail/python-list/2014-July/675048.html">https://mail.python.org/pipermail/python-list/2014-July/675048.html</a><br></div><div><br></div><div>Well, but I also need a very natural thing: a kind of map() and reduce() (that is, functools.reduce(), since we are on Python 3 already). In other words, I need to call a "summarizing" function over all the download tasks completed for the links from a page. This is where I fail :(</div>
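<div><br></div><div>To make the shape concrete, here is a minimal sketch of the plain map/reduce part only, with no flow control at all. It is written with async/await syntax rather than @asyncio.coroutine, and fetch_size() and summarize() are hypothetical stand-ins: no real download happens. What it lacks is exactly the control over how many tasks run at once.</div>

```python
import asyncio


async def fetch_size(link):
    # Hypothetical stand-in for a real download: no network access,
    # it only yields to the event loop and returns a fake "page size".
    await asyncio.sleep(0)
    return len(link)


async def summarize(sizes):
    # The "reduce" step: called once, after every download has completed.
    return sum(sizes)


async def crawl_page(links):
    # The "map" step: one Task per link, then wait for all of them.
    tasks = [asyncio.ensure_future(fetch_size(link)) for link in links]
    sizes = await asyncio.gather(*tasks)
    return await summarize(sizes)


total = asyncio.run(crawl_page(["a.example/x", "b.example/yy"]))
```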
<div><br></div><div>I'd propose an oversimplified but still nice test to model the use case:</div><div>Let's use the Fibonacci function in its inefficient recursive form.</div><div>That is, let coro_sum() be our reduce() function and coro_fib() be our map().</div>
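<div>Something like this:</div><div><br></div><div><div>@asyncio.coroutine</div><div>def coro_sum(x):</div><div>&nbsp;&nbsp;&nbsp;&nbsp;# reduce step: combine the results of the child tasks</div><div>&nbsp;&nbsp;&nbsp;&nbsp;return sum(x)</div><div><br></div><div>@asyncio.coroutine</div><div>def coro_fib(x):</div><div>&nbsp;&nbsp;&nbsp;&nbsp;if x &lt; 2:</div>
<div>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;return 1</div><div>&nbsp;&nbsp;&nbsp;&nbsp;# wished-for API: run coro_sum as a task once both child coroutines are done</div><div>&nbsp;&nbsp;&nbsp;&nbsp;res_coro = executor_pool.spawn_task_when_arg_list_of_coros_ready(coro=coro_sum,</div><div>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;arg_coro_list=[coro_fib(x - 1), coro_fib(x - 2)])</div>
<div>&nbsp;&nbsp;&nbsp;&nbsp;return res_coro<br></div><div> <br></div></div><div>Then we could run the following tests.</div><div><br></div><div>Test #1 on one worker:</div><div><br></div><div> executor_pool = ExecutorPool(workers=1)</div>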
<div> executor_pool.as_completed(coro_fib(x) for x in range(20))<br></div><div><br></div><div>Test #2 on two workers:</div><div><div> executor_pool = ExecutorPool(workers=2)</div><div> executor_pool.as_completed(coro_fib(x) for x in range(20))<br>
</div></div><div><br></div><div>It is very important that every coro_fib() and coro_sum() invocation runs as a Task on some worker, not just spawned implicitly and unmanaged!</div><div><br></div><div>It would be great to find asyncio gurus interested in this very natural goal.</div>
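<div><br></div><div>One possible direction, as a rough sketch rather than a finished answer: limit the actual work with an asyncio.Semaphore while every invocation still gets its own Task. BoundedPool, spawn() and slot() are made-up names, not asyncio APIs; asyncio.sleep(0) stands in for real work; and the code uses async/await syntax rather than @asyncio.coroutine. Note that a task holds a slot only while it works, not while it waits for its children, otherwise a single worker would deadlock.</div>

```python
import asyncio
from contextlib import asynccontextmanager


class BoundedPool:
    """Hypothetical helper (not an asyncio API): explicit Tasks, bounded work."""

    def __init__(self, workers):
        self._slots = asyncio.Semaphore(workers)
        self.tasks_spawned = 0
        self.busy = 0
        self.peak_busy = 0

    def spawn(self, coro):
        # Every invocation becomes an explicit Task; nothing runs unmanaged.
        self.tasks_spawned += 1
        return asyncio.ensure_future(coro)

    @asynccontextmanager
    async def slot(self):
        # Only the real work holds a slot. Waiting for children happens
        # outside, so a parent never starves its own children.
        async with self._slots:
            self.busy += 1
            self.peak_busy = max(self.peak_busy, self.busy)
            try:
                yield
            finally:
                self.busy -= 1


async def coro_sum(pool, values):
    async with pool.slot():
        await asyncio.sleep(0)  # stand-in for real work
        return sum(values)


async def coro_fib(pool, x):
    if x < 2:
        async with pool.slot():
            await asyncio.sleep(0)  # stand-in for real work
            return 1
    children = [pool.spawn(coro_fib(pool, x - 1)),
                pool.spawn(coro_fib(pool, x - 2))]
    values = await asyncio.gather(*children)  # no slot held while waiting
    return await pool.spawn(coro_sum(pool, values))


async def run_test(workers, n):
    # The pool is created inside the running loop on purpose.
    pool = BoundedPool(workers)
    tasks = [pool.spawn(coro_fib(pool, x)) for x in range(n)]
    results = await asyncio.gather(*tasks)
    return results, pool.peak_busy, pool.tasks_spawned


results_1, peak_1, spawned_1 = asyncio.run(run_test(workers=1, n=10))
results_2, peak_2, spawned_2 = asyncio.run(run_test(workers=2, n=10))
```

<div>Here peak_busy records how many work sections ever overlapped, so it should never exceed the number of workers. This bounds concurrent work, not the number of pending Tasks, which still grows with the recursion.</div>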
<div>Your help and ideas would be very much appreciated.</div><div><br clear="all"><div>best regards<br>--<br>Valery </div>
</div></div>