Re: [Python-Dev] A more flexible task creation

From: Tin Tvrtković <tinchester@gmail.com>
Date: Wed, 13 Jun 2018 22:45:22 +0200

Hi, I've been using asyncio a lot lately and have encountered this problem several times: imagine you want to run a lot of queries against a database. Spawning 10000 tasks in parallel will probably cause a lot of them to fail. What you need is a task pool of sorts, to limit concurrency and do only, say, 20 requests in parallel. If we were doing this synchronously, we wouldn't spawn 10000 threads using 10000 connections; we would use a thread pool with a limited number of threads and submit the jobs into its queue. To me, tasks are (somewhat) logically analogous to threads. The solution that first comes to mind is to create an AsyncioTaskExecutor with a submit(coro, *args, **kwargs) method: put a reference to the coroutine and its arguments into an asyncio queue, then spawn n tasks pulling from this queue and awaiting the coroutines. It'd probably be useful to have this in the stdlib at some point.
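[Editor's note: a minimal sketch of the kind of pool described above, under the assumption that it is created inside a running event loop. AsyncioTaskExecutor is a hypothetical name from the email, not an existing asyncio API.]

    import asyncio
    import logging

    class AsyncioTaskExecutor:
        """Hypothetical task pool: n worker tasks drain a queue of jobs."""

        def __init__(self, n_workers=20):
            # Create this inside a running event loop.
            self._queue = asyncio.Queue()
            self._workers = [asyncio.ensure_future(self._work())
                             for _ in range(n_workers)]

        async def _work(self):
            while True:
                coro_fn, args, kwargs = await self._queue.get()
                try:
                    await coro_fn(*args, **kwargs)
                except asyncio.CancelledError:
                    raise
                except Exception:
                    logging.exception('job failed')
                finally:
                    self._queue.task_done()

        async def submit(self, coro_fn, *args, **kwargs):
            # Store the coroutine *function* and its arguments; the coroutine
            # object is only created when a worker picks the job up.
            await self._queue.put((coro_fn, args, kwargs))

        async def join(self):
            await self._queue.join()
            for worker in self._workers:
                worker.cancel()

Inside a coroutine, usage might look like: pool = AsyncioTaskExecutor(n_workers=20), then await pool.submit(run_query, record_id) for each job (run_query being whatever coroutine function performs one request), and finally await pool.join().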

On Thu, 14 Jun 2018 at 17:40, Tin Tvrtković <tinchester@gmail.com> wrote:
> It'd probably be useful to have this in the stdlib at some point.
Probably a good idea, yes, because it seems a rather common use case. OTOH, I did something similar but for a different use case. In my case, I have a Watchdog class that takes a list of (coro, *args, **kwargs). It ensures there is always a task running for each of the coroutines, and it watches the tasks: if they crash, they are automatically restarted (with logging). There is also a stop() method to cancel the watchdog-managed tasks and await them. My use case exists because I tend to write a lot of singleton-style objects which need bookkeeping tasks or redis pubsub listening tasks, and my primary concern is not starting lots of tasks; it is that the few tasks I have must be restarted if they crash, forever. This is why I think it's not that hard to write "sugar" APIs on top of asyncio, and everyone's needs will be different. The strict API compatibility requirements of the core Python stdlib, coupled with the very long feature release life cycles of Python, make me think this sort of thing is perhaps better built in a utility library on top of asyncio, rather than inside asyncio itself. 18 months is a long, long time to iterate on these features. I can't wait for Python 3.8...
--
Gustavo J. A. M. Carneiro
Gambit Research
"The universe is always one step beyond logic." -- Frank Herbert
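[Editor's note: a rough sketch of what such a Watchdog could look like -- this is not Gustavo's actual code; the class and method names are assumptions for illustration only.]

    import asyncio
    import logging

    class Watchdog:
        """Keep one task alive per coroutine spec; restart crashed ones."""

        def __init__(self, specs):
            # specs: list of (coro_fn, args, kwargs) tuples
            self._specs = specs
            self._tasks = []

        def start(self):
            self._tasks = [asyncio.ensure_future(self._supervise(fn, args, kwargs))
                           for fn, args, kwargs in self._specs]

        async def _supervise(self, fn, args, kwargs):
            while True:
                try:
                    await fn(*args, **kwargs)
                except asyncio.CancelledError:
                    raise
                except Exception:
                    logging.exception('task %r crashed, restarting', fn)

        async def stop(self):
            for task in self._tasks:
                task.cancel()
            await asyncio.gather(*self._tasks, return_exceptions=True)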

A lot of my late requests come from my attempt to group some of that in a lib: https://github.com/Tygs/ayo Most of it works, although I got rid of context() recently, but the lazy task part really fails. Indeed, the API allows you to do:

    async with ayo.scope() as run:
        task_list = run.all(foo(), foo(), foo())
        run.asap(bar())
        await task_list.gather()
        run.asap(baz())

scope() returns a nursery-like object, and this works perfectly, with the usual guarantees of Trio's nursery, but working in asyncio right now. However, I tried to add to the mix:

    async with ayo.scope(max_concurrency=2) as run:
        task_list = run.all(foo(), foo(), foo())
        run.asap(bar())
        await task_list.gather()
        run.asap(baz())

And I can't get it to work. task_list will right now contain a list of tasks and None, because some tasks are not scheduled immediately. That's why I wanted lazy tasks. I tried to create my own lazy tasks, but it never really worked. I'm going to try to go down the road of wrapping the unscheduled coro in a future-like object as suggested by Yury. But having that built in seems logical, elegant, and just good design in general: __init__ should not have side effects.
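[Editor's note: a minimal sketch of the "wrap the unscheduled coro in a future-like object" idea -- not ayo's or Yury's actual design, just an illustration of deferring coroutine creation until scheduling time.]

    import asyncio

    class LazyTask:
        """Hold a coroutine function + arguments; become a real Task on demand."""

        def __init__(self, coro_fn, *args, **kwargs):
            self._coro_fn = coro_fn
            self._args = args
            self._kwargs = kwargs
            self._task = None

        def schedule(self):
            # The coroutine object is only created here, so an unscheduled
            # LazyTask costs a few references rather than a coroutine frame.
            if self._task is None:
                self._task = asyncio.ensure_future(
                    self._coro_fn(*self._args, **self._kwargs))
            return self._task

        def __await__(self):
            return self.schedule().__await__()

    # A real future-like wrapper would also need cancel(), done(), result(),
    # etc. so that a scope can manage it exactly like a scheduled Task.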

On Fri, 15 Jun 2018 at 09:18, Michel Desmoulin <desmoulinmichel@gmail.com> wrote:
> Ah, good idea.
> To be honest, I see "async with" being abused everywhere in asyncio lately. I like to have objects with start() and stop() methods, but everywhere I see async context managers. Fine, add nurseries or whatever, but please also have a simple start() / stop() public API. "async with" is only good for functional programming. If you want to go for a more object-oriented style, you tend to have start() and stop() methods in your classes, which will call start() and stop() (or close()) methods recursively on nested resources. Some of the libraries (aiopg, I'm looking at you) don't support start/stop or open/close well.
I tend to slightly agree, but OTOH if asyncio had been designed not to schedule tasks automatically on __init__, I bet there would have been other users complaining "why didn't task XX run?" or "why do tasks need a start() method, that is clunky!". You can't please everyone... Also, in

    task_list = run.all(foo(), foo(), foo())

as soon as you call foo(), you are instantiating a coroutine, which consumes memory, while the task may not even be scheduled for a long time (if you have 5000 potential tasks but only execute 10 at a time, for example). But if you do as Yury suggested, you instead accept a function reference, foo, which is a singleton; you can have many references to the function foo, but they only create coroutine objects when the task is actually about to be scheduled, so it's more efficient in terms of memory.

--
Gustavo J. A. M. Carneiro
Gambit Research
"The universe is always one step beyond logic." -- Frank Herbert
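[Editor's note: a quick way to see the difference -- a standalone sketch, not from the thread, with zzz standing in for any coroutine function.]

    import asyncio

    async def zzz(seconds):
        await asyncio.sleep(seconds)

    # Eager: 5000 coroutine objects (and their frames) exist up front.
    eager = [zzz(0.005) for _ in range(5000)]
    for coro in eager:
        coro.close()   # avoid "coroutine was never awaited" warnings

    # Lazy: only references to the function and its argument are stored;
    # the coroutine object is created right before the task is scheduled.
    lazy = [(zzz, 0.005) for _ in range(5000)]
    fn, arg = lazy[0]
    ready = fn(arg)    # created only when it is about to run
    ready.close()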

Wouldn't calling __aenter__ and __aexit__ manually work for you? I started coding begin() and stop(), but I removed them, as I couldn't find a use case for them. And what exactly is the use case that doesn't work with `async with`? The whole point is to spot the boundaries of the tasks' execution easily. If you start()/stop() randomly, it kind of defeats the purpose. It's a genuine question though. I can totally accept that I overlooked a valid use case.
Well, ensure_future([schedule_immediately=True]) and asyncio.create_task([schedule_immediately=True]) would take care of that. They are the entry points for task creation and scheduling.
Yes, but this has the benefit of accepting any awaitable, not just a coroutine. You don't have to wonder what to pass, or in which form; it's always the same. Too many APIs are hard to understand because you never know whether they accept a callback, a coroutine function, a coroutine, a task, a future... For the same reason, requests.get() creates and destroys a session every time. It's inefficient, but way easier to understand, and it fits the majority of the use cases.
I made some tests, and the memory consumption is indeed radically smaller if you just store references, compared to storing the same unique raw coroutine. However, this is a rare case. It assumes that:

- you have a lot of tasks
- you have a max concurrency
- the max concurrency is very small
- most tasks reuse a similar combination of callables and parameters

It's a very specific, narrow case. Also, everything you store on the scope will be wrapped into a Future object whether it's scheduled or not, so that you can cancel it later, so the memory saving is not as large as it seems. I didn't want to compromise the quality of the current API for the general case for an edge-case optimization. On the other hand, this is a low-hanging fruit, and on platforms such as the Raspberry Pi, where asyncio has a lot to offer, it can make a big difference to shave 20% off the memory consumption of a specific workload. So I listened and implemented an escape hatch:

    import random
    import asyncio
    import ayo

    async def zzz(seconds):
        await asyncio.sleep(seconds)
        print(f'Slept for {seconds} seconds')

    @ayo.run_as_main()
    async def main(run_in_top):
        async with ayo.scope(max_concurrency=10) as run:
            for _ in range(10000):
                run.from_callable(zzz, 0.005)  # or run.asap(zzz(0.005))

This only lazily creates the awaitable (here the coroutine) on scheduling. I see a 15% memory saving for the WHOLE program when using `from_callable()`, so it's definitely a good feature to have, thank you. But again, and I hope Yury is reading this because he will implement that for uvloop, and this will trickle down to asyncio: I think we should not compromise the main API for this. asyncio is hard enough to grok, and too many concepts fly around. The average Python programmer has experienced far easier things in past Python encounters. If we want, one day, for asyncio to be considered the clean AND easy way to do async, we need to work on the API. asyncio.run() is a step in the right direction (although again, I wish we had implemented that 2 years ago when I talked about it, instead of being told no). Now if we add nurseries, they should hide the rest of the complexity, not add to it.

Excuse my ignorance (or maybe it's a vocabulary thing), but I'm trying to understand the problem here. But if I have this right:

> I've been using asyncio a lot lately and have encountered this problem several times. Imagine you want to do a lot of queries against a database; spawning 10000 tasks in parallel will probably cause a lot of them to fail.

async is not parallel -- all the tasks run in the same thread (unless you explicitly spawn another thread), only one task is running at once, and the task switching happens when a task specifically yields control. If it matters in what order the tasks are performed, then you should not be using async. So why do queries fail with 10000 tasks? Or ANY number? If the async DB access code is written right, a given query should not "await" unless it is in a safe state to do so. So what am I missing here???

> What you need is a task pool of sorts, to limit concurrency and do only 20 requests in parallel.

Still wrapping my head around the vocabulary, but async is not concurrent.

> If we were doing this synchronously, we wouldn't spawn 10000 threads using 10000 connections,

and threads aren't synchronous -- but they are concurrent.

> we would use a thread pool with a limited number of threads and submit the jobs into its queue.

because threads ARE concurrent, and there is no advantage to having more threads than can actually run at once, and having many more does cause thread-switching performance issues.

> To me, tasks are (somewhat) logically analogous to threads.

Kinda -- in the sense that they are run (and completed) in arbitrary order. But they are different, and that difference is key to this issue. As Yury expressed interest in this idea, there must be something I'm missing. What is it?

-CHB

--
Christopher Barker, Ph.D.
Oceanographer
Emergency Response Division
NOAA/NOS/OR&R
7600 Sand Point Way NE, Seattle, WA 98115
(206) 526-6959 voice, (206) 526-6329 fax, (206) 526-6317 main reception
Chris.Barker@noaa.gov

On Thu, Jun 14, 2018 at 9:17 PM Chris Barker via Python-Dev <python-dev@python.org> wrote:
> Excuse my ignorance (or maybe it's a vocabulary thing), but I'm trying to understand the problem here.
Vocabulary-wise, 'queue depth' might be a suitable mental aid for what people actually want to limit. The practical issue is most likely something to do with hitting timeouts when trying to queue too much work onto a service.

--
Joni Orponen

On 14Jun2018 1214, Chris Barker via Python-Dev wrote:
If the task isn't actually doing the work, but merely waiting for it to finish, then you can end up overloading the thing that *is* doing the work (e.g. the network interface, database server, other thread/process, file system, etc.). Single-threaded async is actually all about *waiting* - it provides a convenient model to do other tasks while you are waiting for the first (as well as a convenient model to indicate what should be done after it completes - there are two conveniences here). If the underlying thing you're doing *can* run in parallel, but becomes less efficient the more times you do it (for example, most file system operations fall into this category), you will want to limit how many tasks you *start*, not just how many you are waiting for. I often use semaphores for this when I need it, and it looks like asyncio.Semaphore() is sufficient:

    import asyncio

    task_limiter = asyncio.Semaphore(4)

    async def my_task():
        await task_limiter.acquire()
        try:
            await do_db_request()
        finally:
            task_limiter.release()

Cheers,
Steve
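[Editor's note: for what it's worth, asyncio.Semaphore also works as an async context manager, which makes the acquire/release pair slightly less clunky. Same logic as above, with do_db_request still a placeholder.]

    import asyncio

    task_limiter = asyncio.Semaphore(4)

    async def my_task():
        async with task_limiter:
            await do_db_request()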

On Thu, Jun 14, 2018 at 10:03 PM Steve Dower <steve.dower@python.org> wrote:
Yeah, a semaphore logically fits exactly, but:

* I feel this API is somewhat clunky, even if you use an 'async with'.
* my gut feeling is that spawning a thousand tasks and having them all fight over the same semaphore and the scheduler is going to be much less efficient than a small number of tasks draining a queue.

On Thu, Jun 14, 2018 at 3:31 PM, Tin Tvrtković <tinchester@gmail.com> wrote:
Fundamentally, a Semaphore is a queue: https://github.com/python/cpython/blob/9e7c92193cc98fd3c2d4751c87851460a33b9... ...so the two approaches are more analogous than it might appear at first. The big difference is what objects are in the queue. For a web scraper, the options might be either a queue where each entry is a URL represented as a str, or a queue where each entry is (effectively) a Task object with an attached coroutine object. So I think the main differences you'll see in practice are:

- A Task + coroutine aren't terribly big -- maybe a few kilobytes -- but definitely larger than a str, so the Semaphore approach will take more RAM. Modern machines have lots of RAM, so for many use cases this is still probably fine (50,000 tasks is really not that many). But there will certainly be some situations where the str queue fits in RAM but the Task queue doesn't.

- If you create all those Task objects up front, that front-loads a chunk of work (i.e., allocating all those objects!) which would otherwise be spread throughout the queue processing. So you'll see a noticeable pause up front before the code starts working.

-n

--
Nathaniel J. Smith -- https://vorpus.org
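[Editor's note: to make the contrast concrete, here is a rough sketch of the two shapes for a scraper; fetch() is a placeholder coroutine, not something from the thread.]

    import asyncio

    async def fetch(url):
        ...   # placeholder: perform the actual HTTP request here

    # Shape 1: a queue of plain str URLs, drained by a handful of workers.
    async def scrape_with_queue(urls, n_workers=20):
        queue = asyncio.Queue()
        for url in urls:
            queue.put_nowait(url)          # only strings are stored

        async def worker():
            while True:
                try:
                    url = queue.get_nowait()
                except asyncio.QueueEmpty:
                    return
                await fetch(url)

        await asyncio.gather(*[worker() for _ in range(n_workers)])

    # Shape 2: one Task (plus coroutine object) per URL, throttled by a Semaphore.
    async def scrape_with_semaphore(urls, limit=20):
        sem = asyncio.Semaphore(limit)

        async def throttled(url):
            async with sem:
                await fetch(url)

        await asyncio.gather(*[throttled(url) for url in urls])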

Other folks have already chimed in, so I'll be to the point. Try writing a simple asyncio web scraper (using, say, the aiohttp library) and create 5000 tasks for scraping different sites. My prediction is that a whole lot of them will time out for various reasons. Other responses inline.

On Thu, Jun 14, 2018 at 9:15 PM Chris Barker <chris.barker@noaa.gov> wrote:
> async is not parallel -- all the tasks will be run in the same thread [...] and only one task is running at once [...]

asyncio is mostly used for IO-heavy workloads (note the name). If you're doing IO in asyncio, it is most definitely parallel. The point of it is having a large number of open network connections at the same time.
Imagine you have a batch job you need to do. You need to fetch a million records from your database, and you can't use a query to get them all - you need a million individual "get" requests. Even if Python were infinitely fast and your bandwidth were infinite, could your database handle opening a million new connections in parallel, in a very short time? Mine sure can't; even a few hundred extra connections would be a potential problem. So you want to do the work in chunks, but still not one by one.
> and threads aren't synchronous -- but they are concurrent.
Using threads implies coupling threads with IO: doing requests one at a time in a given thread. This is generally called 'synchronous IO', as opposed to asynchronous IO / asyncio.
Weeell technically threads in CPython aren't really concurrent (when running Python bytecode), but for doing IO they are in practice. When doing IO, there absolutely is an advantage to using more threads than can run at once (in CPython only one thread running Python can run at once). You can test it out yourself by writing a synchronous web scraper (using maybe the requests library) and trying to scrape using a threadpool vs using a single thread. You'll find the threadpool version is much faster.
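[Editor's note: a minimal way to try that comparison using only the standard library (urllib.request instead of the requests library mentioned above; the URL list is a placeholder workload).]

    import time
    import urllib.request
    from concurrent.futures import ThreadPoolExecutor

    URLS = ['https://www.python.org'] * 20   # placeholder workload

    def fetch(url):
        with urllib.request.urlopen(url, timeout=10) as resp:
            return len(resp.read())

    start = time.perf_counter()
    serial = [fetch(u) for u in URLS]
    print('single thread:', time.perf_counter() - start)

    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=10) as pool:
        pooled = list(pool.map(fetch, URLS))
    print('thread pool:  ', time.perf_counter() - start)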

On Thu, Jun 14, 2018 at 8:14 PM, Chris Barker via Python-Dev <python-dev@python.org> wrote:
> Excuse my ignorance (or maybe it's a vocabulary thing), but I'm trying to understand the problem here.
All tasks need resources, and bookkeeping for such tasks is likely to slow things down. More importantly, with an uncontrolled number of tasks you can end up with uncontrolled use of resources, decreasing efficiency to levels well below what is attainable with sensible conservation of resources. Imagine, if you will, a task that starts by allocating 1GB of memory. Would you want 10,000 of those?

participants (9):
- Chris Barker
- Gustavo Carneiro
- Joni Orponen
- Michel Desmoulin
- Nathaniel Smith
- Steve Dower
- Steve Holden
- Tin Tvrtković
- Yury Selivanov