[Python-ideas] Async API

Yury Selivanov yselivanov.ml at gmail.com
Thu Oct 25 01:26:32 CEST 2012


On 2012-10-24, at 7:12 PM, Guido van Rossum <guido at python.org> wrote:

> On Wed, Oct 24, 2012 at 4:03 PM, Yury Selivanov <yselivanov.ml at gmail.com> wrote:
>> Hi Guido,
>> 
>> On 2012-10-24, at 6:43 PM, Guido van Rossum <guido at python.org> wrote:
>>> What's the problem with just letting the cleanup take as long as it
>>> wants to and do whatever it wants? That's how try/finally works in
>>> regular Python code.
> 
>> The problem appears when you add timeouts support.
>> 
>> Let me show you an abstract example (I won't use yield_froms, but I'm
>> sure that the problem is the same with them):
>> 
>>   @coroutine
>>   def fetch_comments(app):
>>       session = yield app.new_session()
>>       try:
>>            return (yield session.query(...))
>>       finally:
>>            yield session.close()
>> 
>> and now we execute that with:
>> 
>>   #: Get a list of comments; throw a TimeoutError if it
>>   #: takes more than 1 second
>>   comments = yield fetch_comments(app).with_timeout(1.0)
>> 
>> Now, the scheduler starts with 'fetch_comments', then executes
>> 'new_session', then executes 'session.query', in a round-robin fashion.
>> 
>> Imagine that the database query takes a bit less than a second to
>> execute: the scheduler pushes the result into the coroutine, and then
>> the timeout event occurs.  So the scheduler throws a 'TimeoutError'
>> into the coroutine, preventing 'session.close' from being executed.
>> There is no way for the scheduler to know that it should not deliver
>> the exception right now, because the coroutine is in its 'finally'
>> block.
>> 
>> And this situation is pretty common when such a timeout mechanism
>> is in place and widely used.
> 
> Ok, I can understand. But still, this is a problem with timeouts in
> general, not just with timeouts in a yield-based environment. How does
> e.g. Twisted deal with this?

I don't know; I hope someone with expertise in Twisted can tell us.

But I would imagine that they don't have this particular problem, as it
should be specific to coroutines and the schedulers that run them.  I.e.
it's a problem only when you run some code and may interrupt it.  And you
can't interrupt plain Python code that uses callbacks without yields or
greenlets.
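
To make the failure mode concrete, here is a minimal, framework-free
sketch (plain generators, no real scheduler API) of what happens when a
timeout is delivered with .throw() while a coroutine is suspended in its
'finally' block:

    def fetch_comments():
        try:
            yield 'query'        # the slow database call
        finally:
            yield 'close'        # cleanup; the coroutine suspends again

    gen = fetch_comments()
    print(gen.send(None))        # -> 'query': waiting on the query
    print(gen.send('rows'))      # -> 'close': now waiting on the cleanup

    # The timeout fires here.  From the outside, a coroutine suspended
    # in its 'finally' block looks exactly like one suspended anywhere
    # else, so the scheduler throws and thereby aborts the cleanup:
    try:
        gen.throw(TimeoutError)
    except TimeoutError:
        print('cleanup aborted, session left open')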

> As a work-around, I could imagine some kind of with-statement that
> tells the scheduler we're already in the finally clause (it could
> still send you a timeout if your cleanup takes way too long):
> 
> try:
>  yield <regular code>
> finally:
>  with protect_finally():
>    yield <cleanup code>
> 
> Of course this could be abused, but at your own risk -- the scheduler
> only gives you a fixed amount of extra time and then it's quits.

Right, that's the basic approach.  But it also gives you the feeling of
a "broken" language feature: we have coroutines, but we cannot
implement timeouts on top of them without making 'finally' blocks
look ugly.  And if we assume that any coroutine may be run with a
timeout, you'll need to use 'protect_finally' in virtually every
'finally' block.
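
For reference, here is a sketch of how 'protect_finally' could be wired
up; the scheduler-side names ('Task', 'current_task', 'deliver_timeout')
are made up for illustration, not any real API:

    import contextlib

    class Task:
        def __init__(self, gen):
            self.gen = gen
            self.in_finally = False   # scheduler defers throws while True

    current_task = None               # set by the scheduler on each step

    @contextlib.contextmanager
    def protect_finally():
        task = current_task
        task.in_finally = True
        try:
            yield                     # the coroutine's cleanup runs here
        finally:
            task.in_finally = False

    def deliver_timeout(task):
        if task.in_finally:
            # Defer the throw; a hard deadline should still apply, so
            # that a stuck cleanup cannot block the scheduler forever.
            pass
        else:
            task.gen.throw(TimeoutError)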

I solved the problem by dynamically inlining 'with protect_finally()'
code in the @coroutine decorator (something I would never suggest
putting in the stdlib, btw).  There is also PEP 419, but I don't like it
either, as it is tied to frames, which is too low-level (and I'm not
sure how it would work with future CPython optimizations and PyPy's JIT).

BUT, the concept is nice.  I've implemented a number of protocols with
yield-based coroutines, and managing timeouts with a simple
".with_timeout()" call is a very handy and readable feature.  So I hope
we can all brainstorm this problem to make coroutines "complete", if we
decide to start using them widely.
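
For the curious, the user-facing side of that API can be as small as a
wrapper object carrying a deadline for the scheduler to honour.  A
hypothetical sketch (the names and the scheduler contract are assumed,
not taken from any existing framework):

    import functools
    import time

    class CoroutineWrapper:
        def __init__(self, gen):
            self.gen = gen
            self.deadline = None      # absolute time; None means no limit

        def with_timeout(self, seconds):
            self.deadline = time.monotonic() + seconds
            return self               # chainable, as in the example above

    def coroutine(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            return CoroutineWrapper(func(*args, **kwargs))
        return wrapper

    # The scheduler would compare 'deadline' against the current time
    # before each .send(), and .throw(TimeoutError) into 'gen' once the
    # deadline has passed.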

-
Yury

