[Python-Dev] A more flexible task creation

Michel Desmoulin desmoulinmichel at gmail.com
Wed Jun 13 16:45:22 EDT 2018


I was working on a concurrency limiting code for asyncio, so the user
may submit as many tasks as one wants, but only a max number of tasks
will be submitted to the event loop at the same time.

However, I wanted that passing an awaitable would always return a task,
no matter if the task was currently scheduled or not. The goal is that
you could add done callbacks to it, decide to force schedule it, etc

I dug in the asyncio.Task code, and encountered:

    def __init__(self, coro, *, loop=None):
        ...
        self._loop.call_soon(self._step)
        self.__class__._all_tasks.add(self)

I was surprised to see that instantiating a Task class has any side
effect at all, let alone 2, and one of them being to be immediately
scheduled for execution.

I couldn't find a clean way to do what I wanted: either you
loop.create_task() and you get a task but it runs, or you don't run
anything, but you don't get a nice task object to hold on to.

I tried several alternatives, like returning a future, and binding the
future awaiting to the submission of a task, but that was complicated
code that duplicated a lot of things.

I tried creating a custom task, but it was even harder, setting a custom
event policy, to provide a custom event loop with my own create_task()
accepting parameters. That's a lot to do just to provide a parameter to
Task, especially if you already use a custom event loop (e.g: uvloop). I
was expecting to have to create a task factory only, but task factories
can't get any additional parameters from create_task()).

Additionally I can't use ensure_future(), as it doesn't allow to pass
any parameter to the underlying Task, so if I want to accept any
awaitable in my signature, I need to provide my own custom ensure_future().

All those implementations access a lot of _private_api, and do other
shady things that linters hate; plus they are fragile at best. What's
more, Task being rewritten in C prevents things like setting self._coro,
so we can only inherit from the pure Python slow version.

In the end, I can't even await the lazy task, because it blocks the
entire program.

Hence I have 2 distinct, but independent albeit related, proposals:

- Allow Task to be created but not scheduled for execution, and add a
parameter to ensure_future() and create_task() to control this. Awaiting
such a task would just do like asyncio.sleep(O) until it is scheduled
for execution.

- Add an parameter to ensure_future() and create_task() named "kwargs"
that accept a mapping and will be passed as **kwargs to the underlying
created Task.

I insist on the fact that the 2 proposals are independent, so please
don't reject both if you don't like one or the other. Passing a
parameter to the underlying custom Task is still of value even without
the unscheduled instantiation, and vice versa.

Also, if somebody has any idea on how to make a LazyTask that we can
await on without blocking everything, I'll take it.





More information about the Python-Dev mailing list