On Sun, Jun 07, 2020 at 03:47:05PM +0900, Stephen J. Turnbull wrote:
Welcome to Python Ideas.
Python Dev is more for discussions of implementations of proposed features, typically clearly on their way to an accepted pull request into master. Python-Ideas is a better place for a request for enhancement without an implementation patch, so I'm moving it here.
If it's just that, I can probably provide a patch for run_until_complete.
This makes it difficult to make threads and asyncio coexist in a single application.
True. Do you know of a programming environment that makes it easy that we can study?
Unfortunately no. But that doesn't mean we can't try to design asyncio to be more thread-friendly.
Unless I'm mistaken, there's no obvious way to have a function run a coroutine in a given event loop, whether the loop is running or not, and whether it's running in the current context or not. There are at least 3 cases that I can see:
- The loop isn't running -> run_until_complete is the solution.
- The loop is running in another context -> we might use call_soon_threadsafe and wait for the result.
- The loop is running in the current context but we're not in a coroutine -> there's currently not much we can do without some major code change.
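To make the three cases concrete, here is a minimal sketch of a dispatching helper (the name `run_coro` is my own invention, not an existing API). The first two cases map onto existing asyncio machinery; the third has no safe general answer today, so the sketch fails loudly instead:

```python
import asyncio

def run_coro(loop, coro):
    """Hypothetical helper: run *coro* on *loop* from any context.

    Covers the first two cases; the third (the loop is running in
    the current thread) has no safe general solution, so we raise.
    """
    if not loop.is_running():
        # Case 1: the loop is idle -- drive it ourselves.
        return loop.run_until_complete(coro)
    try:
        running = asyncio.get_running_loop()
    except RuntimeError:
        running = None
    if running is not loop:
        # Case 2: the loop runs in another thread -- hand the
        # coroutine over and block on the concurrent.futures result.
        return asyncio.run_coroutine_threadsafe(coro, loop).result()
    # Case 3: we are in the loop's own thread but outside a
    # coroutine; blocking here would deadlock the loop.
    coro.close()
    raise RuntimeError("cannot re-enter a running event loop")
```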
What I'd like to propose is that run_until_complete handle all three cases.
I don't see how the three can be combined safely and generically. I can imagine approaches that might work in a specific application, but the general problem is going to be very hard. async programming is awesome, and may be far easier to get 100% right than threading in some applications, but it has its pitfalls too, as Nathaniel Smith points out: https://vorpus.org/blog/some-thoughts-on-asynchronous-api-design-in-a-post-a...
I'm not sure what's hard about combining the three cases. The distinction between the three cases can be made easily with loop.is_running() and comparing asyncio.get_event_loop() to self.
Implementing the third case can be a bit tricky, but it should mostly be a matter of making the loop itself behave nicely with its nested and parent instances. A few important notes:
- Infinite recursion is easily possible, but it can be mitigated by:
  - Transitively giving a higher priority to the tasks the current one is awaiting on. (Likely non-trivial to implement.)
  - Emitting a warning when the recursion depth goes beyond a given threshold.
- This case as a whole would also likely deserve a warning if it's called directly from a coroutine.
Did I miss an obvious way to make the migration from threads to asyncio easier?
Depends on what your threads are doing. If I were in that situation, I'd probably try to decouple the thread code from the async code by having the threads feed one or more queues into the async code base, and vice versa.
It becomes harder and harder not to believe that Python is purposefully trying to make this kind of migration more painful.
Again, it would help a lot if you had an example of known working APIs we could implement, and/or more detail about your code's architecture to suggest workarounds.
That's the thing. I don't know all the threads nor all the machinery in the application. If I did, I could probably make stronger assumptions. The application connects to several servers and relays messages between them. There's at least one thread per connection and one that processes commands.
I'm working on switching out a thread-based client protocol lib for an asyncio-based lib instead. To make the switch mostly invisible, I run an event loop in its own thread.
When the server-connection-thread wants to send a message, it calls the usual method, which I modified to use call_soon_threadsafe to run a coroutine in the event loop and wait for the result.
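For that direction, asyncio already has a dedicated tool: `asyncio.run_coroutine_threadsafe`, which wraps the call_soon_threadsafe-and-wait dance. A sketch of the pattern described, with an invented `AsyncClient` class standing in for the real lib:

```python
import asyncio
import threading

class AsyncClient:
    """Sketch: the event loop lives in a dedicated thread, and
    synchronous callers submit coroutines to it."""

    def __init__(self):
        self.loop = asyncio.new_event_loop()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def _run(self):
        asyncio.set_event_loop(self.loop)
        self.loop.run_forever()

    def send_message(self, text):
        # Called from any *other* thread: schedule the coroutine on
        # the loop thread and block until it completes.
        future = asyncio.run_coroutine_threadsafe(self._send(text), self.loop)
        return future.result(timeout=5)

    async def _send(self, text):
        # Placeholder for the real async protocol write.
        await asyncio.sleep(0)
        return "sent: " + text

    def close(self):
        self.loop.call_soon_threadsafe(self.loop.stop)
        self._thread.join()
```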
When an async callback is triggered by the lib, I just run the same old code, which calls the whole machinery I do not control. By chance, it never ends up wanting to call a coroutine, therefore no need for a recursive call to the event loop... apparently... But if it did, I'd end up calling call_soon_threadsafe from the thread that runs the event loop and waiting for the result. Which would just lock up. And that's what bugs me the most. To work around that, I would probably have to create a thread for the whole callback so that the event loop can get control back immediately. (And possibly create more threads, with all the bad things this entails.)
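That lock-up happens because `future.result()` would block the only thread capable of completing the future. One cheap mitigation, short of restructuring the code, is a guard that fails fast instead of deadlocking; here is a sketch (the helper name `submit_and_wait` is hypothetical):

```python
import asyncio

def submit_and_wait(loop, coro):
    """Sketch: like run_coroutine_threadsafe(...).result(), but
    raise instead of deadlocking when called from the loop's own
    thread."""
    try:
        running = asyncio.get_running_loop()
    except RuntimeError:
        running = None  # no running loop in this thread: safe to block
    if running is loop:
        # We *are* the loop thread: blocking on the future would
        # prevent the loop from ever completing it.
        coro.close()  # avoid a "coroutine was never awaited" warning
        raise RuntimeError("would deadlock: called from the loop's own thread")
    return asyncio.run_coroutine_threadsafe(coro, loop).result()
```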
Another issue I found is that the client protocol lib class isn't always instantiated by the same thread, and its constructor indirectly instantiates asyncio objects (like Event). In the end, its start method will start the event loop in its own thread. With the deprecation of the loop argument, Event() finds the loop it should use by calling get_event_loop() in its constructor. I therefore had to add a call to set_event_loop(), even though this event loop will never run in this thread. That's a pretty ugly hack. The deprecation of the loop argument looks like an incentive to create the asyncio objects in the threads that use them. Which seems pretty crazy to me as a whole. And practically, it means that the thread creation couldn't be part of the class itself, thus incurring more complexity to the outside code.
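For reference, a minimal sketch of that workaround (class name `Client` and the `connected` event are invented for illustration; the binding-at-construction behaviour applies to the deprecation-era 3.8/3.9 releases):

```python
import asyncio
import threading

class Client:
    """Sketch of the workaround described: bind the loop to the
    constructing thread so loop-aware objects find it, even though
    the loop will actually run in its own thread later."""

    def __init__(self):
        self.loop = asyncio.new_event_loop()
        # The "ugly hack": make get_event_loop() in *this* thread
        # return our loop so Event() can bind to it, even though
        # the loop will never run here.
        asyncio.set_event_loop(self.loop)
        self.connected = asyncio.Event()

    def start(self):
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def _run(self):
        # The loop actually runs in this dedicated thread.
        asyncio.set_event_loop(self.loop)
        self.loop.run_forever()
```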
As you can see I already have the workarounds I need. They're just pretty hack-ish. I was also a bit lucky. And I think these could be made simpler and prettier by implementing the suggested changes to run_until_complete().
Best regards, Celelibi