[Python-ideas] Protecting finally clauses of interruptions
Paul Colomiets
paul at colomiets.name
Wed Apr 4 20:44:30 CEST 2012
Hi Yury,
On Wed, Apr 4, 2012 at 7:59 PM, Yury Selivanov <yselivanov.ml at gmail.com> wrote:
>> Here is a more detailed version of the previous example (although still simplified):
>>
>> @coroutine
>> def add_money(user_id, money):
>>     yield redis_lock(user_id)
>>     try:
>>         yield redis_incr('user:'+user_id+':money', money)
>>     finally:
>>         yield redis_unlock(user_id)
>>
>> # this one is crucial to show the point of discussion
>> # other functions are similar:
>> @coroutine
>> def redis_unlock(lock):
>>     yield redis_socket.wait_write()  # yields back when socket is ready for writing
>>     cmd = ('DEL user:'+lock+'\n').encode('ascii')
>>     redis_socket.write(cmd)  # should be a loop here, actually
>>     yield redis_socket.wait_read()
>>     result = redis_socket.read(1024)  # here a loop too
>>     assert result == 'OK\n'
>>
>> When the trampoline gets a coroutine from a `next()` or `send()` call, it
>> puts it on top of the stack and doesn't dispatch the original one until the
>> topmost one has exited.
>>
>> The point is that if a timeout arrives inside the `redis_unlock` function, we
>> must wait until the `finally` clause of `add_money` has finished
>
> How can it "arrive" inside "redis_unlock"? Let's assume you called
> "add_money" as such:
>
> yield add_money(1, 10).with_timeout(10)
>
> Then it's the 'add_money' coroutine that should be in the timeouts queue/tree!
> 'add_money' specifically is what should be interrupted when your 10s timeout
> expires. And if 'add_money' is in its 'finally' statement - you simply postpone
> its interruption, meaning that 'redis_unlock' will end its execution nicely.
>
> Again, I'm not sure how exactly you manage your timeouts. The way I do it,
> simplified: I have a timeouts heapq with pointers to those coroutines
> that were *explicitly* executed with a timeout. So I'm protecting only
> the coroutines in that queue, because only they can be interrupted. And
> the coroutines they call are protected *automatically*.
>
> If you do it differently, can you please elaborate on how your scheduler
> is actually designed?
>
I have a global timeout for processing a single request. It's actually higher
in the chain of generator calls. So the dispatcher looks like:

    def dispatcher(self, method, args):
        with timeout(10):
            yield getattr(self, method)(*args)

And all the local timeouts, like the timeout for a single request, are
usually applied at the socket level, where the specific protocol
is implemented:
    def redis_unlock(lock):
        yield redis_socket.wait_write(2)  # wait two seconds
        # TimeoutError may have been raised in wait_write()
        cmd = ('DEL user:'+lock+'\n').encode('ascii')
        redis_socket.write(cmd)  # should be a loop here, actually
        yield redis_socket.wait_read(2)  # another two seconds
        result = redis_socket.read(1024)  # here a loop too
        assert result == 'OK\n'
So they are not interruptions. Although we don't use them
much with coroutines; a global timeout per request is
usually enough.
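For what it's worth, the timeouts heapq you describe could be sketched
roughly like this (all names here are my assumptions, not your actual
scheduler): only coroutines explicitly run with a timeout are registered,
and everything they call is protected automatically simply by not being
in the queue.

```python
import heapq
import time

class TimeoutQueue:
    def __init__(self):
        self._heap = []  # (deadline, tiebreaker, coroutine) entries

    def add(self, coro, seconds, now=None):
        now = time.monotonic() if now is None else now
        # id(coro) breaks ties so coroutine objects are never compared
        heapq.heappush(self._heap, (now + seconds, id(coro), coro))

    def pop_expired(self, now=None):
        # Return every coroutine whose deadline has passed; the
        # scheduler would then try to interrupt each of them.
        now = time.monotonic() if now is None else now
        expired = []
        while self._heap and self._heap[0][0] <= now:
            expired.append(heapq.heappop(self._heap)[2])
        return expired
```

The scheduler's main loop would call pop_expired() on each tick and throw
a timeout exception into each returned coroutine (deferring the throw if
it is inside a finally block).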
But anyway, I don't see a reason to protect a single frame,
because even if you have a simple mutex without coroutines
you end up with:

    def something():
        lock.acquire()
        try:
            pass
        finally:
            lock.release()
And if the lock's implementation is something along the lines of:

    def release(self):
        self._native_lock.release()

How can you be sure that the interruption is not delivered
after the interpreter has resolved `self._native_lock.release` but
has not yet called it?
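The window is visible in the bytecode: the attribute lookup and the call
compile to separate instructions, so an asynchronous exception can land
between them and the native lock would never be released. A small
illustration (the Lock class is a stand-in; exact opcode names vary
across CPython versions):

```python
import dis

class Lock:
    def __init__(self, native_lock):
        self._native_lock = native_lock

    def release(self):
        self._native_lock.release()

# One LOAD_* instruction resolves the attribute, and a separate
# CALL* instruction invokes it; an interruption can fall in between.
ops = [i.opname for i in dis.get_instructions(Lock.release)]
print(ops)
```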
> OK, point taken. Please give me a couple of days to at least
> come up with a summary document.
No hurry.
> I still don't like your
> solution because it works directly with frames. With the
> upcoming PyPy support of Python 3, I don't think I want
> to lose the JIT support.
>
That's also an interesting question. I don't think it's possible to interrupt
JIT'ed code at an arbitrary location.
> Ideally, as I proposed earlier, we should introduce some
> sort of interruption protocol -- method 'interrupt()', with
> perhaps a callback.
>
On which object? Is it sys.interrupt()? Or is it thread.interrupt()?
>> you want to implement thread interruption, and that's not
>> my point, there is another thread for that.
>
> We have two requests: ability to safely interrupt python
> function or generator (1); ability to safely interrupt
> python's threads (2). Both (1) and (2) share the same
> requirement of safe 'finally' statements. To me, both
> features are similar enough to come up with a single
> solution, rather than inventing different approaches.
>
Again, I do not propose point (1) as described. I propose a
way to *inspect* the stack to see if it's safe to interrupt.
>> On Wed, Apr 4, 2012 at 3:03 AM, Greg Ewing <greg.ewing at canterbury.ac.nz> wrote:
>>>
>>> I don't think a frame flag on its own is quite enough.
>>> You don't just want to prevent interruptions while in
>>> a finally block, you want to defer them until the finally
>>> counter gets back to zero. Making the interrupter sleep
>>> and try again in that situation is rather ugly.
>
> That's the second reason I don't like your proposal.
>
> def foo():
>     try:
>         ..
>     finally:
>         yield unlock()
>     # <--- the ideal point to interrupt foo
>
>     f = open('a', 'w')
>     # what if we interrupt it here?
>     try:
>         ..
>     finally:
>         f.close()
>
And which one fixes this problem? There is no guarantee
that your timeout code hasn't interrupted
at "# what if we interrupt it here?". Making it a bit less likely
is not a real solution. Please don't present it as one.
>>> So perhaps there could also be a callback that gets
>>> invoked when the counter goes down to zero.
>>
>> Do you mean putting a callback in a frame, which gets
>> executed at the next bytecode just like a signal handler,
>> except it waits until the finally clause is executed?
>>
>> It would work, except it may have a slight performance
>> impact on each bytecode. But I'm not sure if it would
>> be noticeable.
>
> That's essentially the way we currently do it. We transform the
> coroutine's __code__ object to change it from:
>
> def a():
>     try:
>         # code1
>     finally:
>         # code2
>
> to:
>
> def a():
>     __self__ = __get_current_coroutine()
>     try:
>         # code1
>     finally:
>         __self__.enter_finally()
>         try:
>             # code2
>         finally:
>             __self__.exit_finally()
>
> 'enter_finally' and 'exit_finally' maintain the internal counter
> of finally blocks. If a coroutine needs to be interrupted, we check
> that counter. If it is 0 - throw in a special exception. If not -
> wait till it becomes 0 and throw the exception in 'exit_finally'.
>
The problem is in interrupting another thread. You must
inspect the stack only from C code holding the GIL. The implementation
might be more complex, but yes, it can probably be done
without a noticeable slowdown.
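For reference, the enter_finally/exit_finally counter you describe could
be sketched in pure, single-threaded Python roughly like this (the names
and the Interrupted exception are my assumptions; the real implementation
rewrites __code__ objects):

```python
class Interrupted(Exception):
    pass

class Coroutine:
    def __init__(self):
        self._finally_depth = 0
        self._interrupt_pending = False

    def enter_finally(self):
        self._finally_depth += 1

    def exit_finally(self):
        self._finally_depth -= 1
        # Deliver a deferred interruption once the outermost
        # finally block has completed.
        if self._finally_depth == 0 and self._interrupt_pending:
            self._interrupt_pending = False
            raise Interrupted()

    def interrupt(self):
        if self._finally_depth == 0:
            raise Interrupted()
        # Inside a finally block: postpone until exit_finally().
        self._interrupt_pending = True
```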
--
Paul