Add «iterate non-blocking» wrapper to prevent blocking loop too long
At work we faced a problem of long-running Python code. Our case was a short task, but a huge count of iterations. Something like:

    for x in data_list:
        # do 1ms non-io pure python task

So we block the loop for more than 100ms, or even 1000ms. The first naive solution was "move this to a thread" so Python will switch thread context and the asyncio loop will not be blocked, but this raised two problems:

* If we do asyncio things in our task (create `Future` in our case), then we need to pass the loop explicitly and use `call_soon_threadsafe`
* We still saw asyncio warnings about blocking the loop. Not sure why, but maybe because the GIL was not released when the main/asyncio thread became active.

We ended up with a wrapper for the iterable, which switches the «asyncio context» via `asyncio.sleep(0)` (since we know that sleep(0) has special code which just switches context), by time or by count. Here is our code:

    import asyncio
    import time

    async def iterate_non_blocking(iterable, context_switch_interval=0.01, context_switch_count=None):
        last_context_switch_time = time.perf_counter()
        for i, value in enumerate(iterable, start=1):
            yield value
            switch_context_by_interval = context_switch_interval and \
                (time.perf_counter() - last_context_switch_time) >= context_switch_interval
            switch_context_by_count = context_switch_count and i % context_switch_count == 0
            if switch_context_by_interval or switch_context_by_count:
                await asyncio.sleep(0)
                last_context_switch_time = time.perf_counter()

I'm not sure if this is a good approach, but it solves all our problems for this kind of blocking case. Here come the discussable things:

* Is there a better approach for such cases?
* Should this be a part of asyncio?
* Should this be a part of the documentation as a recipe?
**Sorry, did not know markdown is supported**

At work we faced a problem of long-running Python code. Our case was a short task, but a huge count of iterations. Something like:

``` python
for x in data_list:
    # do 1ms non-io pure python task
```

So we block the loop for more than 100ms, or even 1000ms. The first naive solution was "move this to a thread" so Python will switch thread context and the asyncio loop will not be blocked, but this raised two problems:

* If we do asyncio things in our task (create `Future` in our case), then we need to pass the loop explicitly and use `call_soon_threadsafe`
* We still saw asyncio warnings about blocking the loop. Not sure why, but maybe because the GIL was not released when the main/asyncio thread became active.

We ended up with a wrapper for the iterable, which switches the «asyncio context» via `asyncio.sleep(0)` (since we know that sleep(0) has special code which just switches context), by time or by count. Here is our code:

``` python
import asyncio
import time

async def iterate_non_blocking(iterable, context_switch_interval=0.01, context_switch_count=None):
    last_context_switch_time = time.perf_counter()
    for i, value in enumerate(iterable, start=1):
        yield value
        switch_context_by_interval = context_switch_interval and \
            (time.perf_counter() - last_context_switch_time) >= context_switch_interval
        switch_context_by_count = context_switch_count and i % context_switch_count == 0
        if switch_context_by_interval or switch_context_by_count:
            await asyncio.sleep(0)
            last_context_switch_time = time.perf_counter()
```

I'm not sure if this is a good approach, but it solves all our problems for this kind of blocking case. Here come the discussable things:

* Is there a better approach for such cases?
* Should this be a part of asyncio?
* Should this be a part of the documentation as a recipe?
On Fri, 14 Jun 2019 at 11:38, Nikita Melentev
**Sorry, did not know markdown is supported**
Oh cool! It's not "supported" in the sense that this is a mailing list, and whether your client renders markdown is client-dependent (mine doesn't for example). But it looks like Mailman3 does, and with the use of the "Message archived at" footer, it's really easy to see a rendered version if you need to. Another nice feature of Mailman3/Hyperkitty :-) Paul
I may say something stupid, but aren't coroutines exactly what you are looking for?

On Fri, 14 Jun 2019 at 13:07, Paul Moore
_______________________________________________
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-leave@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/3JBTPP...
Code of Conduct: http://python.org/psf/codeofconduct/
The problem here is that even if I have a coroutine, all code between «awaits» is blocking.

``` python
async def foo():
    data = await connection.get()  # it is ok, loop handling request, we waiting
    # from here
    for item in data:  # this is 10 ** 6 len
        do_sync_job(item)  # this took 1ms
    # to here we are blocking loop for 1 second
    await something_next()
```
On Fri, Jun 14, 2019 at 10:44 PM Nikita Melentev
The problem here is that even if I have a coroutine all code between «awaits» is blocking.
Isn't that kinda the point of coroutines? If you want more yield points, you either insert more awaits, or you use threads instead. ChrisA
That is exactly the point of coroutines, but as I described above, there are cases where the blocking code is too long and moving it to a thread makes it harder to use.
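[Editor's note: a minimal sketch of the thread approach and the awkwardness described above. The names `blocking_job` and `main` are hypothetical; the pattern of passing the loop explicitly and going through `call_soon_threadsafe` is what the original post complains about.]

``` python
import asyncio

def blocking_job(loop, done, items):
    # Runs in a worker thread: the CPU-bound loop itself.
    total = sum(item * item for item in items)  # stand-in for the 1ms-per-item work
    # Touching asyncio objects from a thread requires going through
    # the loop explicitly, and thread-safely:
    loop.call_soon_threadsafe(done.set_result, total)

async def main():
    loop = asyncio.get_running_loop()
    done = loop.create_future()
    # Off-load the whole loop to the default thread pool executor.
    loop.run_in_executor(None, blocking_job, loop, done, range(1000))
    return await done

print(asyncio.run(main()))
```

The event loop stays responsive while the thread runs, but every asyncio interaction from inside the job now needs this loop plumbing.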
So -

Now thinking on the problem as a whole -

I think maybe a good way to address this is to put the logic of "counting N iterations or X time and allowing a switch" - the logic you had to explicitly mingle into your code in the first example - in a function that could wrap the iterator of the `for` loop.

However, that idea would need a _synchronous_ call to the loop to force an async context switch - I don't know if there is such a call (even an internal one). Actually, I don't know if that is possible - but I can't think of another way of factoring it out without explicitly triggering an await.

The switcher would then be used just as we use `enumerate` - think of something along:

    for context_switch, data_set in async_switcher(data_sets, timeout=100):
        for record in context_switch(data_set):
            # the call to __iter__ here would include
            # the logic to decide whether to switch
            ...

And for a single (non-nested) loop, either a separate call or:

    for data_set in next(async_switcher(data_sets, timeout=100))[0]:
        ...

All in all: if there is a valid way to force the async context switch in a sync call to this wrapper object, it is possible to create a small package that would have this feature and be easy to use.

(And then, we discuss further down if this is stdlib-worthy)
I'm not sure this is a good approach. For me `async for` is just the best way, since it is explicit. When you see `async for`, you think «alright, context will switch somewhere inside, I am aware of this». If I get you right, though.
Despite my concerns over code for an implementation in my previous e-mail, it turns out that simply iterating in an `async for` loop won't yield to the asyncio loop. An explicit "await" inside the async generator is needed for that.

That makes factoring out the code presented in the first e-mail in this thread somewhat trivial - and it will read something like:
```
import asyncio
import time

async def switch():
    await asyncio.sleep(0)

async def iterate_non_blocking(iterator, timeout=None, iterations=None):
    if timeout is None and iterations is None:
        timeout = 0.1
    ts = time.time()
    counter = 0
    for item in iterator:
        yield item
        counter += 1
        if (iterations and counter >= iterations) or \
                (timeout and time.time() - ts >= timeout):
            await switch()
            counter = 0
            ts = time.time()
```
Which can then be used like:
```
async for i in iterate_non_blocking(range(steps), iterations=30):
...
```
I've created a gist with this code at
https://gist.github.com/jsbueno/ae6001c55ee3ff001fb7c152b1f109b2
And if people agree it is important enough, I am OK with creating a full package with this code, or helping to add it to the stdlib.
Not sure how asyncio can help in this case.
It has a warning in debug mode already.
Adding `await asyncio.sleep(0)` is the correct fix for your case.
I don't think that the code should be a part of asyncio.
A recipe is a good idea maybe, not sure. The problem is that the
snippet itself is not very helpful.
Documentation should say that long-running code in async functions
should be avoided.
It requires a good clarification of what long-running code is.
I'm not a documentation writing expert, especially for such not very
obvious areas.
-- Thanks, Andrew Svetlov
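[Editor's note: the debug-mode warning mentioned above can be observed with a few lines. The 50ms threshold below is just an illustration; the loop attribute `slow_callback_duration` defaults to 100ms.]

``` python
import asyncio
import time

async def main():
    loop = asyncio.get_running_loop()
    loop.slow_callback_duration = 0.05  # warn on callbacks slower than 50ms
    time.sleep(0.2)  # deliberately block the event loop

# In debug mode asyncio logs a warning such as
# "Executing <Task ...> took 0.200 seconds"
asyncio.run(main(), debug=True)
```

This is the warning the original poster was seeing; `await asyncio.sleep(0)` between chunks of work is what makes it go away.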
The problem is that the snippet itself is not very helpful.
Explain please. The good thing is that if this snippet ends up somewhere (asyncio or the docs), then the user will not have to decide on their own "what is a long-running task", because a good default value will be there. This also reduces the time to fix such cases; sometimes when you write code you don't expect huge data in your loops, but if your case allows "interrupting" your loop with a context switch, then you can just use the wrapper and forget about it.
On Fri, 14 Jun 2019 at 12:00, Nikita Melentev
The problem is that there is no good universal default for when to yield. Yielding too frequently will make the program run slower; yielding too infrequently will cause latency in coroutine switching (the problem you are trying to solve).

The problem gets more complicated with nested loops. Do you yield at all loop nesting levels? If not, only the outer one, or only the inner one?

That said, your code snippet is a nice tool to have in the toolbox. I've written something similar but not as complete: my version just yielded after N iterations; your version also allows yielding after X seconds, which is nice.

I almost wish such functionality could be built into Python. But I'm not sure it can be done without a performance penalty.
-- Gustavo J. A. M. Carneiro Gambit Research "The universe is always one step beyond logic." -- Frank Herbert
Fortunately, asyncio provides this good universal default: 100ms, when the WARNING appears. Nested loops can be solved with a context manager, which will share `last_context_switch_time` between loops. But the main thing here is that this is strictly optional, and when someone uses this thing they will know what it is and why they need it.
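[Editor's note: a sketch of the shared-state idea for nested loops mentioned above, shown as a plain helper object rather than a context manager. The names `Switcher`, `tick`, and `total` are made up for illustration; the point is that one shared timer serves both nesting levels.]

``` python
import asyncio
import time

class Switcher:
    """Shared switch timer, awaited from any nesting level."""

    def __init__(self, interval=0.01):
        self.interval = interval
        self.last = time.perf_counter()

    async def tick(self):
        # Yield to the event loop only if `interval` elapsed since the last switch.
        if time.perf_counter() - self.last >= self.interval:
            await asyncio.sleep(0)
            self.last = time.perf_counter()

async def total(matrix):
    switch = Switcher(interval=0.01)
    acc = 0
    for row in matrix:       # outer loop
        for item in row:     # inner loop shares the same timer
            acc += item
            await switch.tick()
    return acc

print(asyncio.run(total([[1, 2], [3, 4]])))  # prints 10
```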
Regardless of a mechanism for counting time, etc... maybe a plain and simple addition to asyncio would be a context-switching call that does what `asyncio.sleep(0)` does today?

It would feel better to write something like `await asyncio.switch()` than an arbitrary `sleep`.
time.sleep(0) is used for a thread context switch; it is a very well-known feature.

await asyncio.sleep(0) does the same for async tasks.

Why do we need another API?
it is very well known feature.
Or is it? Just because you know it, it does not mean it is universal - this is not documented on time.sleep, threading.Thread, or asyncio.sleep anyway.

I've never worked much on explicitly multi-threaded code, but in 15+ years this is a pattern I had not seen up to today. The original writer of this thread, Nikita, also does not seem to have found `asyncio.sleep(0)` in less than 10 minutes of looking for what he needed.

It allows for "explicit is better than implicit", really asserts the intention of the code writer, and has a low cost to implement in the stdlib. And it is semantically more correct, at the cost of maybe 3 LOC in the stdlib.
We need either both `asyncio.switch()` and `time.switch()`
(`threading.switch()` maybe) or none of them.
https://docs.python.org/3/library/asyncio-task.html#sleeping has the
explicit sentence:
sleep() always suspends the current task, allowing other tasks to run.
On Fri, 14 Jun 2019 at 11:20, Andrew Svetlov
OK, I agree: this, or an improved version of this sentence to explicitly illustrate the "sleep(0)" case, is enough.
I think the main point here is that, yes, it is known that `await
asyncio.sleep(0)` yields control allowing other tasks to run.
But imagine you have a for loop with 1000 iterations, and which normally
completes in 1 second. Which means that each iteration takes 1ms to
complete (maybe it does some simple calculation).
Now add `await asyncio.sleep(0)` to the for loop body. You will notice
that the same loop will take much longer to complete, even if there are no
other tasks running, because of the added overhead of switching context to
the main event loop 1000 times.
So there is no simple alternative here:
1. If you don't yield in the for loop body, then you are blocking the main
loop for 1 second;
2. If you yield in every iteration, you solved the task switch latency
problem, but you make the entire program run much slower.
I don't know for sure if asyncio can solve this problem, but I at least am
willing to admit this is a real problem.
I've been through it myself, many times. It's a tough problem to solve,
actually. Requires lots of monitoring in production, to find what are the
tasks that are causing these latency problems.
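[Editor's note: the per-iteration overhead described above is easy to observe. This sketch, with illustrative numbers not taken from the thread, runs the same summation yielding on every iteration versus once per 1000 iterations.]

``` python
import asyncio
import time

async def summed(n, every):
    acc = 0
    for i in range(n):
        acc += i
        if i % every == 0:
            await asyncio.sleep(0)  # yield to the event loop
    return acc

async def main():
    n = 100_000
    t0 = time.perf_counter()
    r1 = await summed(n, every=1)      # yield on every iteration
    t1 = time.perf_counter()
    r2 = await summed(n, every=1000)   # yield once per 1000 iterations
    t2 = time.perf_counter()
    assert r1 == r2                    # same result either way
    return t1 - t0, t2 - t1

dense, sparse = asyncio.run(main())
# `dense` is typically several times larger than `sparse`:
# scheduling 100,000 trips through the event loop dominates the work itself.
print(dense, sparse)
```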
Exactly our case! My position is the same as what njsmith (AFAIR) said somewhere about running file I/O in threads: yes, it is faster to write a chunk directly from a coroutine than to write a chunk from an executor, but you guarantee that there will be no «freeze».
The real problem is: you have a long-running synchronous loop (or a
CPU-bound task in general).
The real solution is: run it inside a thread pool.
By explicitly inserting context switches, you don't eliminate the
problem, you hide it.
The asyncio loop is still busy handling your CPU-bound task, which
degrades the whole system's response time.
Imagine you have 1000 tasks, each of which pauses execution for at most 10ms.
The loop can be stalled for 1000 × 10ms = 10 sec if all tasks decide to
switch at the same time.
You cannot control it; asyncio uses a non-preemptive strategy for task
switching.
The preemptive strategy is called multi-threading and requires
participation from the OS kernel. Period.
-- Thanks, Andrew Svetlov
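The thread-pool approach Andrew recommends can be sketched with `run_in_executor`; `crunch` here is a stand-in for the original poster's per-item pure-Python work:

```python
import asyncio

def crunch(data):
    # Stand-in for the CPU-bound pure-Python loop; it runs in a
    # worker thread, so the event loop thread is not blocked by it.
    return [x * x for x in data]

async def main():
    loop = asyncio.get_running_loop()
    # None selects the loop's default ThreadPoolExecutor; for truly
    # CPU-heavy work a ProcessPoolExecutor also sidesteps the GIL.
    return await loop.run_in_executor(None, crunch, range(5))

print(asyncio.run(main()))  # prints [0, 1, 4, 9, 16]
```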
Are you sure about your calculations? If we have 1000 task switches at the "same time", then after one task starts doing its job, after 10ms it will `sleep(0)` and the loop will have time to choose the next task. Why will the loop be paused in this case?
Because we have 1000 tasks scheduled for execution on the next loop iteration.
The first consumes 10ms and pauses (switches context).
The next task is executed *in the same loop iteration*; it consumes
its own 10ms and switches.
The same is repeated for all 1000 tasks in *the same loop iteration*,
I want to stress this fact.
10 sec as the result (really even a little more, since asyncio needs
time to execute its own code too).
-- Thanks, Andrew Svetlov
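Andrew's arithmetic can be checked with a small experiment (timings are approximate and machine-dependent): each task busy-spins for 10ms between `sleep(0)` calls, yet total wall time is still roughly tasks × chunks × 10ms, because the loop thread itself executes every chunk.

```python
import asyncio
import time

def spin(ms):
    # Busy-wait to simulate a CPU-bound chunk that holds the GIL.
    deadline = time.perf_counter() + ms / 1000
    while time.perf_counter() < deadline:
        pass

async def worker(chunks):
    for _ in range(chunks):
        spin(10)                # 10ms of CPU work on the loop thread
        await asyncio.sleep(0)  # yield; the work is only interleaved

async def main():
    start = time.perf_counter()
    await asyncio.gather(*(worker(5) for _ in range(10)))
    return time.perf_counter() - start

elapsed = asyncio.run(main())
print(f"{elapsed:.2f}s")  # around 0.5s: 10 tasks x 5 chunks x 10ms
```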
Oh, I see. Thank you for the clarification. In this case such a wrapper is useless, unfortunately.
Gustavo Carneiro wrote:
1. If you don't yield in the for loop body, then you are blocking the main loop for 1 second;
2. If you yield in every iteration, you solved the task switch latency problem, but you make the entire program run much slower.
It sounds to me like asyncio is the wrong tool for this job. You want a background task that can be preempted by a foreground task. That's what threads are for. Asyncio gives you non-preemptive task scheduling. -- Greg
On Sat, 15 Jun 2019 at 00:26, Greg Ewing wrote:
It sounds to me like asyncio is the wrong tool for this job. You want a background task that can be preempted by a foreground task. That's what threads are for. Asyncio gives you non-preemptive task scheduling.
Perhaps. But using threads is more complicated. You have to worry about the integrity of your data in the face of concurrent threads. And if inside your task you sometimes need to call async coroutine code, again you need to be extra careful, you can't just call coroutines from threads directly. But I think you do have a point. If a developer starts reaching for `await asyncio.sleep(0)` too often, perhaps it is time to start considering running that code in the default thread pool executor, whatever the cost may be.
-- Greg
-- Gustavo J. A. M. Carneiro Gambit Research "The universe is always one step beyond logic." -- Frank Herbert
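On Gustavo's point that "you can't just call coroutines from threads directly": the supported mechanism is `asyncio.run_coroutine_threadsafe`, sketched below with an illustrative placeholder coroutine `lookup`:

```python
import asyncio
import threading

async def lookup():
    # A coroutine that must run on the event loop, not in the thread.
    await asyncio.sleep(0.01)
    return "ok"

def thread_job(loop, results):
    # From a plain thread, submit the coroutine to the running loop;
    # we get back a concurrent.futures.Future we can block on here
    # without blocking the loop itself.
    fut = asyncio.run_coroutine_threadsafe(lookup(), loop)
    results.append(fut.result())

async def main():
    loop = asyncio.get_running_loop()
    results = []
    t = threading.Thread(target=thread_job, args=(loop, results))
    t.start()
    while t.is_alive():          # "join" without blocking the loop
        await asyncio.sleep(0.01)
    return results

print(asyncio.run(main()))  # prints ['ok']
```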
On 15 Jun 2019, at 10:55, Gustavo Carneiro wrote:
Perhaps. But using threads is more complicated. You have to worry about the integrity of your data in the face of concurrent threads. And if inside your task you sometimes need to call async coroutine code, again you need to be extra careful, you can't just call coroutines from threads directly.
I work on a large product that uses the Twisted async framework, and I can assure you the data integrity problem exists in async code as well. At least in Twisted it is easy enough to deferToThread() a background task that, once complete, continues processing on the foreground thread; we do that for heavy file IO tasks. But we offload compute tasks to other processes, as using threads in Python does not help because of the GIL and we have to meet latency targets. Barry
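Barry's process-offload pattern, expressed in asyncio terms rather than Twisted's deferToThread, might look like this sketch (`cpu_heavy` is a placeholder for real compute work):

```python
import asyncio
from concurrent.futures import ProcessPoolExecutor

def cpu_heavy(n):
    # Pure-Python CPU work; in a child process it does not contend
    # for the parent interpreter's GIL at all.
    return sum(i * i for i in range(n))

async def main():
    loop = asyncio.get_running_loop()
    with ProcessPoolExecutor() as pool:
        return await loop.run_in_executor(pool, cpu_heavy, 10)

if __name__ == "__main__":
    print(asyncio.run(main()))  # prints 285
```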
participants (9)
- Adrien Ricocotam
- Andrew Svetlov
- Barry Scott
- Chris Angelico
- Greg Ewing
- Gustavo Carneiro
- Joao S. O. Bueno
- Nikita Melentev
- Paul Moore