[Python-Dev] A more flexible task creation

Yury Selivanov yselivanov.ml at gmail.com
Wed Jun 13 17:26:56 EDT 2018


On Wed, Jun 13, 2018 at 4:47 PM Michel Desmoulin
<desmoulinmichel at gmail.com> wrote:
>
> I was working on concurrency-limiting code for asyncio, so that the user
> may submit as many tasks as they want, but only a maximum number of tasks
> will be submitted to the event loop at the same time.

What does that "concurrency limiting code" do?  What problem does it solve?

>
> However, I wanted passing an awaitable to always return a task,
> whether or not the task was currently scheduled. The goal is that
> you could add done callbacks to it, decide to force-schedule it, etc.

The obvious advice is to create a new class "DelayedTask" with a
Future-like API.  You can then schedule the real awaitable that it
wraps with `loop.create_task` at any point.  Providing an
"add_done_callback"-like API is trivial.  DelayedTask can itself be an
awaitable, scheduling itself on the first __await__ call.

As a benefit, your implementation will support any Task-like objects
that alternative asyncio loops can implement. No need to mess with
policies either.
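
For illustration only, a minimal sketch of what such a wrapper might look
like.  The class name comes from the suggestion above; the `schedule()` and
`done()` methods and all attribute names are assumptions made for this
example, not an existing asyncio API, and cancellation and error handling
are left out.

```python
import asyncio


class DelayedTask:
    """Hypothetical Future-like wrapper that defers loop.create_task()."""

    def __init__(self, coro, *, loop=None):
        self._coro = coro
        self._loop = loop        # optional; falls back to the running loop
        self._task = None        # the real Task, created lazily
        self._callbacks = []     # callbacks registered before scheduling

    def add_done_callback(self, fn):
        # Before scheduling, remember the callback; afterwards, delegate.
        if self._task is None:
            self._callbacks.append(fn)
        else:
            self._task.add_done_callback(fn)

    def schedule(self):
        """Create the real task on first call; later calls return it as-is."""
        if self._task is None:
            loop = self._loop or asyncio.get_running_loop()
            self._task = loop.create_task(self._coro)
            for fn in self._callbacks:
                self._task.add_done_callback(fn)
            self._callbacks.clear()
        return self._task

    def done(self):
        return self._task is not None and self._task.done()

    def __await__(self):
        # Awaiting the wrapper schedules the wrapped awaitable if needed.
        return self.schedule().__await__()
```

In this sketch the callbacks receive the real task object created by the
loop, so whatever Task-like class the loop provides is used unchanged.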

>
> I dug in the asyncio.Task code, and encountered:
>
>     def __init__(self, coro, *, loop=None):
>         ...
>         self._loop.call_soon(self._step)
>         self.__class__._all_tasks.add(self)
>
> I was surprised to see that instantiating a Task class has any side
> effect at all, let alone 2, one of them being that the task is
> immediately scheduled for execution.

To be fair, implicitly scheduling a task for execution is what all
async frameworks (twisted, curio, trio) do when you wrap a coroutine
into a task.  I don't recall them having a keyword argument to control
when the task is scheduled.
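
As a small illustration of that implicit scheduling in asyncio (the `job`
and `main` names are made up for this sketch): the task below is never
awaited, yet it runs as soon as the creating coroutine yields to the loop.

```python
import asyncio


async def job(log):
    log.append("job ran")


async def main():
    log = []
    loop = asyncio.get_running_loop()
    # Creating the task already queues its first step on the loop.
    task = loop.create_task(job(log))
    before = list(log)           # nothing has run yet at this point
    await asyncio.sleep(0)       # yield to the loop once
    return before, log, task.done()
```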

>
> I couldn't find a clean way to do what I wanted: either you
> loop.create_task() and you get a task but it runs, or you don't run
> anything, but you don't get a nice task object to hold on to.

A clean way is to create a new layer of abstraction (e.g. DelayedTask
I suggested above).

[..]
> I tried creating a custom task, but it was even harder: it meant setting
> a custom event loop policy to provide a custom event loop with my own
> create_task() accepting parameters. That's a lot to do just to provide a
> parameter to Task, especially if you already use a custom event loop
> (e.g. uvloop). I was expecting to have to create a task factory only, but
> task factories can't get any additional parameters from create_task().

I don't think creating a new Task implementation is needed here, a
simple wrapper should work just fine.

[..]
> Hence I have 2 distinct and independent, albeit related, proposals:
>
> - Allow Task to be created but not scheduled for execution, and add a
> parameter to ensure_future() and create_task() to control this. Awaiting
> such a task would just behave like asyncio.sleep(0) until it is scheduled
> for execution.
>
> - Add a parameter to ensure_future() and create_task() named "kwargs"
> that accepts a mapping and will be passed as **kwargs to the underlying
> created Task.
>
> I insist on the fact that the 2 proposals are independent, so please
> don't reject both if you don't like one or the other. Passing a
> parameter to the underlying custom Task is still of value even without
> the unscheduled instantiation, and vice versa.

Well, to add a 'kwargs' parameter to ensure_future() we need kwargs in
Task.__init__.  So far we only have 'loop' and it's not something that
ensure_future() should allow you to override.  So unless we implement
the first proposal, we don't need the second.

Yury

