Obtaining stack-frames from co-routine objects
Hi all,

Apologies in advance; I'm not a regular, and this may have been handled already (but I couldn't find it when searching).

I've been using the new async/await functionality (congrats again to Yury on getting that through!), and I'd like to get a stack trace between the place at which blocking occurs and the outer co-routine. For example, consider this code:

"""
async def a():
    await b()

async def b():
    await switch()

@types.coroutine
def switch():
    yield

coro_a = a()
coro_a.send(None)
"""

At this point I'd really like to be able to somehow get a stack trace similar to:

test.py:2
test.py:4
test.py:9

Using the gi_frame attribute of coro_a, I can get the line number of the outer frame (e.g.: line 2), but from there there is no way to descend the stack to reach the actual yield point.

I thought that perhaps the switch() co-routine could yield the frame object returned from inspect.currentframe(); however, once that function yields, that frame object has its f_back changed to None.

A hypothetical approach would be to work the way down from the outer frame, but that requires getting access to the co-routine object that the outer frame is currently await-ing. Some hypothetical code could be:

"""
def show(coro):
    print("{}:{}".format(coro.gi_frame.f_code.co_filename,
                         coro.gi_frame.f_lineno))
    if dis.opname[coro.gi_code.co_code[coro.gi_frame.f_lasti + 1]] == 'YIELD_FROM':
        show(coro.gi_frame.f_stack[0])
"""

This relies on the fact that an await-ing co-routine will be executing a YIELD_FROM instruction. The above code uses a completely hypothetical 'f_stack' property of frame objects to pull the co-routine object which a co-routine is currently await-ing from the stack. I've implemented a proof-of-concept f_stack property in frameobject.c just to test out the above code, and it seems to work.

With all that, some questions:

1) Does anyone else see value in trying to get the stack-trace down to the actual yield point?
2) Is there a different way of doing it that doesn't require changes to Python internals?
3) Assuming no to #2, is there a better way of getting the information compared to the pretty hacky byte-code/stack inspection?

Thanks,
Ben
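Ben's two observations can be reproduced directly. The sketch below (which uses the cr_frame spelling that released CPython settled on for native coroutines, alongside the gi_frame spelling used in this thread) shows both the reachable outer line number and the dead-end f_back on the yielded frame:

```python
import inspect
import types

@types.coroutine
def switch():
    # Yield our own frame so the caller can try to walk it.
    yield inspect.currentframe()

async def b():
    await switch()

async def a():
    await b()

coro_a = a()
inner = coro_a.send(None)

# The outer suspension point is visible via the coroutine's frame:
outer = coro_a.cr_frame
print(outer.f_code.co_name, outer.f_lineno)  # the "await b()" line in a()

# ...but once suspended, frames are unlinked from any call stack, so
# there is no way to walk from the yield point back up:
print(inner.f_back)  # None
```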
Hi Ben,

Is there any real-world scenario where you would need this?

It looks like this can help with debugging, somehow, but the easiest solution is to put a "if debug: log(...)" before "yield" in your "switch()" function. You'll have a perfect traceback there.

Thanks,
Yury

On 2015-05-29 12:46 AM, Ben Leslie wrote:

_______________________________________________
Python-Dev mailing list -- Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Hi Yury,

I'm just starting my exploration into using async/await; all my 'real-world' scenarios are currently hypothetical.

One such hypothetical scenario: suppose I have a server process running, with some set of concurrent connections, each managed by a co-routine. Each co-routine is of some arbitrary complexity, e.g.: some combination of reading files, reading from a database, and reading from peripherals. If I notice that one of those co-routines appears stuck and is not making progress, I'd very much like to debug that, and preferably in a way that doesn't necessarily stop the rest of the server (or even the co-routine that appears stuck).

The problem with the "if debug: log(...)" approach is that you need foreknowledge of the fault state occurring; on a busy server you don't want to just be logging every 'switch()'. I guess you could do something like "switch_state[outer_coro] = get_current_stack_frames()" on each switch. To me, double book-keeping something that the interpreter already knows seems somewhat wasteful, but maybe it isn't really too bad.

Cheers,
Ben

On 29 May 2015 at 23:57, Yury Selivanov <yselivanov.ml@gmail.com> wrote:
Hi Ben,

On 2015-05-31 8:35 AM, Ben Leslie wrote:
I guess it all depends on how "switching" is organized in your framework of choice. In asyncio, for instance, all the code that knows about coroutines is in tasks.py. The `Task` class is responsible for running coroutines, and it's the single place where you would need to put the "if debug: ..." line for debugging "slow" Futures -- the only thing that coroutines can get "stuck" on (the other thing is accidentally calling blocking code, but your proposal wouldn't help with that).

Yury
On 2 June 2015 at 14:39, Yury Selivanov <yselivanov.ml@gmail.com> wrote:
I suspect that I haven't properly explained the motivating case.

My motivating case is being able to debug a relatively large, complex system. If the system crashes (through an exception), or in some other manner enters an unexpected state (co-routines blocked for too long), it would be very nice to be able to debug an arbitrary co-routine, not necessarily the one indicating a bad system state, to see exactly what it is/was doing at the time the anomalous behaviour occurs.

So, this is a case of trying to analyse some system-wide behaviour rather than necessarily one particular task. Until you start analysing the rest of the system, you don't know which co-routines you want to analyse.

My motivation for this is primarily avoiding double book-keeping. I assume that the framework has organised things so that there is some data structure to find all the co-routines (or some other object wrapping the co-routines) and that all the "switching" occurs in one place. With that in mind I can have some code that works something like:

@coroutine
def switch():
    coro_stacks[current_coro] = inspect.getouterframes(inspect.currentframe())
    ....

I think something like this is probably the best approach to achieve my desired goals with the currently available APIs. However, I really don't like it, as it requires this double book-keeping. I'm manually retaining this traceback for each coro, which seems like a waste of memory, considering the interpreter already has this information stored, just unexposed.

I feel it is desirable, and in line with existing Python patterns, to expose the interpreter data structures, rather than making the user do extra work to access the same information.

Cheers,
Ben
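Ben's book-keeping sketch can be made runnable. In this hedged version (the coro_stacks dict and the 'demo' key are illustrative names, not part of any framework), the capture works because switch() is still executing at the moment it records the frames, so its frame is still linked to its awaiting callers:

```python
import inspect
import types

coro_stacks = {}  # illustrative bookkeeping structure

@types.coroutine
def switch():
    # While switch() is still executing (i.e. before the yield), its
    # frame is linked back through its awaiting callers, so the full
    # chain can be captured here.
    frames = inspect.getouterframes(inspect.currentframe())
    coro_stacks['demo'] = [f.function for f in frames]
    yield

async def b():
    await switch()

async def a():
    await b()

a().send(None)
print(coro_stacks['demo'][:3])  # ['switch', 'b', 'a']
```

This is exactly the double book-keeping Ben objects to: the snapshot duplicates state the interpreter already holds in the suspended frames.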
On Thu, Jun 11, 2015 at 11:38 PM, Ben Leslie <benno@benno.id.au> wrote:
Ben,

I suspect that this final paragraph is actually the crux of your request. You need to understand what the interpreter is doing before you can propose an API to its data structures.

The particular thing to understand about coroutines is that a coroutine which is suspended at "yield" or "yield from" has a frame but no stack -- the frame holds the locals and the suspension point, but it is not connected to any other frames. Its f_back pointer is literally NULL. (Perhaps you are more used to threads, where a suspended thread still has a stack.) Moreover, the interpreter has no bookkeeping that keeps track of suspended frames. So I'm not sure exactly what information you think the interpreter has stored but does not expose.

Even asyncio/tasks.py does not have this bookkeeping -- it keeps track of Tasks, which are an asyncio-specific class that wraps certain coroutines, but not every coroutine is wrapped by a Task (and this is intentional, as a coroutine is a much more lightweight data structure than a Task instance).

IOW I don't think that the problem here is that you haven't sufficiently motivated your use case -- you are asking for information that just isn't available. (Which is actually where you started the thread -- you can get to the frame of the coroutine but there's nowhere to go from that frame.)

--
--Guido van Rossum (python.org/~guido)
On 13 June 2015 at 04:13, Guido van Rossum <guido@python.org> wrote:
IOW I don't think that the problem here is that you haven't sufficiently motivated your use case -- you are asking for information that just isn't available. (Which is actually where you started the thread -- you can get to the frame of the coroutine but there's nowhere to go from that frame.)
If I'm understanding Ben's request correctly, it isn't really the stack trace that he's interested in (as you say, that specific phrasing doesn't match the way coroutine suspension works), but rather having visibility into the chain of control flow delegation for currently suspended frames: what operation is the outermost frame ultimately blocked *on*, and how did it get to the point of waiting for that operation?

At the moment, all of the coroutine and generator-iterator resumption information is implicit in the frame state, so we can't externally introspect the delegation of control flow in a case like Ben's original example (for coroutines), or like this one for generators:

    def g1():
        yield 42

    def g2():
        yield from g1()

    g = g2()
    next(g)
    # We can tell here that g is suspended
    # We can't tell that it delegated flow to a g1() instance

I wonder if in 3.6 it might be possible to *add* some bookkeeping to "await" and "yield from" expressions that provides external visibility into the underlying iterable or coroutine that the generator-iterator or coroutine has delegated flow control to. As an initial assessment, the runtime cost would be:

* an additional pointer added to generator/coroutine objects to track control flow delegation
* setting that when suspending in "await" and "yield from" expressions
* clearing it when resuming in "await" and "yield from" expressions

(This would be a read-only borrowed reference from a Python level perspective, so it shouldn't be necessary to alter the reference count - we'd just be aliasing the existing reference from the frame's internal stack state)

Regards,
Nick.

--
Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
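Historical note for readers: the delegation pointer proposed here did land in released CPython (in 3.5, as gi_yieldfrom on generator-iterators and cr_await on coroutines). A short sketch of the visibility it adds to the generator example above:

```python
def g1():
    yield 42

def g2():
    yield from g1()

g = g2()
next(g)

# With the delegation pointer exposed, the suspended chain is visible:
sub = g.gi_yieldfrom            # the g1() instance that g delegated to
print(sub.gi_code.co_name)      # 'g1'
print(sub.gi_yieldfrom)         # None -- g1 is suspended at a plain yield
```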
On Sat, Jun 13, 2015 at 12:22 AM, Nick Coghlan <ncoghlan@gmail.com> wrote:
Ah, this makes sense. I think the object you're after is 'reciever' [sic] in the YIELD_FROM opcode implementation, right? -- --Guido van Rossum (python.org/~guido)
On 13 June 2015 at 19:03, Guido van Rossum <guido@python.org> wrote:
Right, this is the exact book-keeping that I was originally referring to in my previous emails (sorry for not making that more explicit earlier).

The 'reciever' is actually on the stack part of the co-routine's frame the whole time (i.e.: pointed to via f->f_stacktop). So from my point of view the book-keeping is there (albeit somewhat obscurely!), but the objects on the frame's stack aren't exposed via the Python wrapping of PyFrameObject (although they could, I think, easily be exposed). Once you get at the receiver object (which is another co-routine), you can traverse down to the point at which the co-routine 'switched'.

Cheers,
Ben
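A sketch of the traversal Ben describes, written against the cr_await / gi_yieldfrom attributes that CPython 3.5 ultimately exposed rather than the hypothetical f_stack property (the awaiters() helper name is illustrative):

```python
import types

@types.coroutine
def switch():
    yield

async def b():
    await switch()

async def a():
    await b()

def awaiters(coro):
    """Walk from the outer coroutine down the await/yield-from chain
    to the suspension point, recording each frame's location."""
    entries = []
    while coro is not None:
        frame = getattr(coro, 'cr_frame', None) or getattr(coro, 'gi_frame', None)
        entries.append("{}:{}".format(frame.f_code.co_name, frame.f_lineno))
        # cr_await (coroutines) / gi_yieldfrom (generators) is the
        # "receiver" Ben found on the frame's value stack.
        coro = getattr(coro, 'cr_await', None) or getattr(coro, 'gi_yieldfrom', None)
    return entries

coro_a = a()
coro_a.send(None)
chain = awaiters(coro_a)
print(chain)  # ['a:<line>', 'b:<line>', 'switch:<line>'] -- down to the yield
```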
On 13 June 2015 at 17:22, Nick Coghlan <ncoghlan@gmail.com> wrote:
Thanks Nick for rephrasing with the appropriate terminology. I had tried to get it right, but with a background of implementing OS kernels, I have strong habits of existing terminology to break. I agree with your suggestion that explicitly having the pointers is much nicer than my opcode hack implementation.

Without side-tracking this discussion, I do just want to say that the hypothetical code is something that actually works if frame objects expose the stack, which is possibly as easy as:

+static PyObject *
+frame_getstack(PyFrameObject *f, void *closure)
+{
+    PyObject **p;
+    PyObject *list = PyList_New(0);
+
+    if (list == NULL)
+        return NULL;
+
+    if (f->f_stacktop != NULL) {
+        for (p = f->f_valuestack; p < f->f_stacktop; p++) {
+            /* FIXME: is this the correct error handling condition? */
+            if (PyList_Append(list, *p)) {
+                Py_DECREF(list);
+                return NULL;
+            }
+        }
+    }
+
+    return list;
+}

I have implemented and tested this and it worked well (although I really don't know the CPython internals well enough to know whether the above code has some serious issues with it).

Is there any reason an f_stack attribute is not exposed for frames? Many of the other PyFrameObject values are exposed. I'm guessing that there probably aren't too many places where you can get hold of a frame that doesn't have an empty stack in normal operation, so it probably isn't necessary.

Anyway, I'm not suggesting that adding f_stack is better than explicitly adding pointers, but it does seem a more general thing that can be exposed, enabling this use case without requiring extra book-keeping data structures.

Cheers,
Ben
On 13 June 2015 at 20:25, Ben Leslie <benno@benno.id.au> wrote:
Is there any reason an f_stack attribute is not exposed for frames? Many of the other PyFrameObject values are exposed. I'm guessing that there probably aren't too many places where you can get hold of a frame that doesn't have an empty stack in normal operation, so it probably isn't necessary.
There's also the fact that anything we do in CPython that assumes a value stack based interpreter implementation might not work on other Python implementations, so we generally try to keep that level of detail hidden.
Anyway, I'm not suggesting that adding f_stack is better than explicitly adding pointers, but it does seem a more general thing that can be exposed and enable this use case without requiring extra book-keeping data structures.
Emulating CPython frames on other implementations is already hard enough, without exposing the value stack directly. Compared to "emulate CPython's value stack", "keep track of delegation to subiterators and subcoroutines" is a far more reasonable request to make of developers of other implementations.
From a learnability perspective, there's also nothing about an "f_stack" attribute that says "you can use this to find out where a generator or coroutine has delegated control", while attributes like "gi_delegate" or "cr_delegate" would be more self-explanatory.
Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
On Sat, Jun 13, 2015 at 9:21 AM, Nick Coghlan <ncoghlan@gmail.com> wrote:
On 13 June 2015 at 20:25, Ben Leslie <benno@benno.id.au> wrote:
Is there any reason an f_stack attribute is not exposed for frames? Many of the other PyFrameObject values are exposed. I'm guessing that there probably aren't too many places where you can get hold of a frame that doesn't have an empty stack in normal operation, so it probably isn't necessary.
There's also the fact that anything we do in CPython that assumes a value stack based interpreter implementation might not work on other Python implementations, so we generally try to keep that level of detail hidden.
Anyway, I'm not suggesting that adding f_stack is better than explicitly adding pointers, but it does seem a more general thing that can be exposed and enable this use case without requiring extra book-keeping data structures.
Emulating CPython frames on other implementations is already hard enough, without exposing the value stack directly.
I'm not sure how strong this argument is. We also expose bytecode, which is about as unportable as anything. (Though arguably we can't keep bytecode a secret, because it's written to .pyc files.) I find Ben's implementation pretty straightforward, and because it makes a copy, I don't think there's anything one could do with the exposed stack that could violate any of the interpreter's invariants (I might change it to return a tuple to emphasize this point to the caller though). But I agree it isn't a solution to the question about the suspension "stack".
Compared to "emulate CPython's value stack", "keep track of delegation to subiterators and subcoroutines" is a far more reasonable request to make of developers of other implementations.
From a learnability perspective, there's also nothing about an "f_stack" attribute that says "you can use this to find out where a generator or coroutine has delegated control", while attributes like "gi_delegate" or "cr_delegate" would be more self-explanatory.
Stack frame objects are kind of expensive and I would hate to add an extra pointer to every frame just to support this functionality. Perhaps we could have a flag though that says whether the top of the stack is in fact the generator object on which we're waiting in a yield-from? This flag could perhaps sit next to f_executing (also note that this new flag is mutually exclusive with f_executing). We could then easily provide a new method or property on the frame object that returns the desired generator if the flag is set or None if the flag is not set -- other Python implementations could choose to implement this differently. -- --Guido van Rossum (python.org/~guido)
On 14 Jun 2015 03:35, "Guido van Rossum" <guido@python.org> wrote:
On Sat, Jun 13, 2015 at 9:21 AM, Nick Coghlan <ncoghlan@gmail.com> wrote:
From a learnability perspective, there's also nothing about an "f_stack" attribute that says "you can use this to find out where a generator or coroutine has delegated control", while attributes like "gi_delegate" or "cr_delegate" would be more self-explanatory.
Stack frame objects are kind of expensive and I would hate to add an
extra pointer to every frame just to support this functionality. Perhaps we could have a flag though that says whether the top of the stack is in fact the generator object on which we're waiting in a yield-from? This flag could perhaps sit next to f_executing (also note that this new flag is mutually exclusive with f_executing). We could then easily provide a new method or property on the frame object that returns the desired generator if the flag is set or None if the flag is not set -- other Python implementations could choose to implement this differently. Fortunately, we can expose this control flow delegation info through the generator-iterator and coroutine object APIs, rather than needing to do it directly on the frame. I'd missed that it could be done without *any* new C level state though - I now think Ben's right that we should be able to just expose the delegation lookup from the resumption logic itself as a calculated property. Cheers, Nick.
Nick Coghlan wrote:
I wonder if in 3.6 it might be possible to *add* some bookkeeping to "await" and "yield from" expressions that provides external visibility into the underlying iterable or coroutine that the generator-iterator or coroutine has delegated flow control to.
In my original implementation of yield-from, stack frames had an f_yieldfrom slot referring to the object being yielded from. I gather that slot no longer exists at the C level, but it ought to be possible to provide a Python attribute or method returning the same information. -- Greg
On 14 June 2015 at 09:20, Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
Nick Coghlan wrote:
I wonder if in 3.6 it might be possible to *add* some bookkeeping to "await" and "yield from" expressions that provides external visibility into the underlying iterable or coroutine that the generator-iterator or coroutine has delegated flow control to.
In my original implementation of yield-from, stack frames had an f_yieldfrom slot referring to the object being yielded from.
I gather that slot no longer exists at the C level, but it ought to be possible to provide a Python attribute or method returning the same information.
OK, this is really easy to implement actually. I implemented it as gi_yieldfrom on the generator object rather than f_yieldfrom on the frame object, primarily as most of the related code is already in the genobject.c module. Something like this:

diff --git a/Objects/genobject.c b/Objects/genobject.c
index 3c32e7b..bc42fe5 100644
--- a/Objects/genobject.c
+++ b/Objects/genobject.c
@@ -553,11 +553,22 @@ gen_set_qualname(PyGenObject *op, PyObject *value)
     return 0;
 }
 
+static PyObject *
+gen_getyieldfrom(PyGenObject *gen)
+{
+    PyObject *yf = gen_yf(gen);
+    if (yf == NULL)
+        Py_RETURN_NONE;
+
+    return yf;
+}
+
 static PyGetSetDef gen_getsetlist[] = {
     {"__name__", (getter)gen_get_name, (setter)gen_set_name,
      PyDoc_STR("name of the generator")},
     {"__qualname__", (getter)gen_get_qualname, (setter)gen_set_qualname,
      PyDoc_STR("qualified name of the generator")},
+    {"gi_yieldfrom", (getter)gen_getyieldfrom, NULL, NULL},
     {NULL} /* Sentinel */
 };

(Note: the above probably doesn't exactly follow correct coding conventions, and probably needs an appropriate doc string.)

This greatly simplifies my very original 'show' routine to:

def show(coro):
    print("{}:{} ({})".format(coro.gi_frame.f_code.co_filename,
                              coro.gi_frame.f_lineno, coro))
    if coro.gi_yieldfrom:
        show(coro.gi_yieldfrom)

I think this would give the widest flexibility for non-CPython implementations to implement the same property in the most appropriate manner.

If this seems like a good approach I'll try and work it in to a suitable patch for contribution.

Cheers,
Ben
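[Editor's note: for readers following along today, CPython 3.5+ ended up shipping this introspection as gi_yieldfrom on generator objects and cr_await on native coroutine objects (issue 24450). A runnable sketch of Ben's traversal against those released attributes, using his original example:]

```python
import types

@types.coroutine
def switch():
    yield  # the innermost suspension point

async def b():
    await switch()

async def a():
    await b()

def show(coro):
    """Walk the await/yield-from delegation chain of a suspended coroutine,
    collecting "filename:lineno" for each suspended frame."""
    entries = []
    while coro is not None:
        # Native coroutines expose cr_frame; generator-based ones gi_frame.
        frame = getattr(coro, "cr_frame", None) or getattr(coro, "gi_frame", None)
        if frame is None:
            break
        entries.append("{}:{}".format(frame.f_code.co_filename, frame.f_lineno))
        # Follow the delegation: cr_await for native coroutines,
        # gi_yieldfrom for generator-based ones; None at the yield point.
        coro = getattr(coro, "cr_await", None) or getattr(coro, "gi_yieldfrom", None)
    return entries

coro_a = a()
coro_a.send(None)  # run until the bare yield in switch()
for entry in show(coro_a):
    print(entry)   # three frames: a -> b -> switch
coro_a.close()
```

This prints one line per suspended frame, descending from the outer coroutine to the actual yield point, which is exactly the stack trace asked for at the start of the thread.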
On 14 Jun 2015 10:01, "Ben Leslie" <benno@benno.id.au> wrote:
If this seems like a good approach I'll try and work it in to a suitable patch for contribution.
I think it's a good approach, and worth opening an enhancement issue for. I expect any patch would need some adjustments after Yury has finished revising the async/await implementation to address some beta compatibility issues with functools.singledispatch and Tornado. Cheers, Nick.
A good plan. I think this could be added to 3.5 still? It's a pretty minor adjustment to the PEP 492 machinery, really. On Sat, Jun 13, 2015 at 6:16 PM, Nick Coghlan <ncoghlan@gmail.com> wrote:
On 14 Jun 2015 10:01, "Ben Leslie" <benno@benno.id.au> wrote:
If this seems like a good approach I'll try and work it in to a suitable patch for contribution.
I think it's a good approach, and worth opening an enhancement issue for.
I expect any patch would need some adjustments after Yury has finished revising the async/await implementation to address some beta compatibility issues with functools.singledispatch and Tornado.
Cheers, Nick.
_______________________________________________ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/guido%40python.org
-- --Guido van Rossum (python.org/~guido)
Per Nick's advice I've created enhancement proposal 245340 with an attached patch. On 14 June 2015 at 19:16, Guido van Rossum <guido@python.org> wrote:
A good plan. I think this could be added to 3.5 still? It's a pretty minor adjustment to the PEP 492 machinery, really.
On Sat, Jun 13, 2015 at 6:16 PM, Nick Coghlan <ncoghlan@gmail.com> wrote:
On 14 Jun 2015 10:01, "Ben Leslie" <benno@benno.id.au> wrote:
If this seems like a good approach I'll try and work it in to a suitable patch for contribution.
I think it's a good approach, and worth opening an enhancement issue for.
I expect any patch would need some adjustments after Yury has finished revising the async/await implementation to address some beta compatibility issues with functools.singledispatch and Tornado.
Cheers, Nick.
-- --Guido van Rossum (python.org/~guido)
On 14/06/2015 11:50, Ben Leslie wrote:
Per Nick's advice I've created enhancement proposal 245340 with an attached patch.
http://bugs.python.org/issue24450 as opposed to http://bugs.python.org/issue24450#msg245340 :) -- My fellow Pythonistas, ask not what our language can do for you, ask what you can do for our language. Mark Lawrence
On 14 Jun 2015 19:17, "Guido van Rossum" <guido@python.org> wrote:
A good plan. I think this could be added to 3.5 still? It's a pretty
minor adjustment to the PEP 492 machinery, really. Good point - as per Ben's original post, the lack of it makes it quite hard to get a clear picture of the system state when using coroutines in the current beta. Cheers, Nick.
On Sat, Jun 13, 2015 at 6:16 PM, Nick Coghlan <ncoghlan@gmail.com> wrote:
On 14 Jun 2015 10:01, "Ben Leslie" <benno@benno.id.au> wrote:
If this seems like a good approach I'll try and work it in to a suitable patch for contribution.
I think it's a good approach, and worth opening an enhancement issue for.
I expect any patch would need some adjustments after Yury has finished
revising the async/await implementation to address some beta compatibility issues with functools.singledispatch and Tornado.
Cheers, Nick.
-- --Guido van Rossum (python.org/~guido)
Hi, I have a question; I hope I'll find the solution here. Say I have a Queue.
>>> h = Queue.Queue(maxsize=0)
>>> h.put(1)
>>> h.put(2)
>>> h.empty()
False
>>> h.join()
>>> h.empty()
False
>>> h.get()
1
>>> h.get()
2
>>> h.get()
Blocked.......................
My question is: in a single-threaded environment, why does get() block instead of raising an exception? At the interpreter I have no way to resume working. And my second question is: why do we have to explicitly call task_done() after get()? Why doesn't get() call it implicitly? An entry in unfinished_tasks is automatically added by put(); why isn't it removed by get()? Thanks. On Fri, May 29, 2015 at 10:16 AM, Ben Leslie <benno@benno.id.au> wrote:
Hi all,
Apologies in advance; I'm not a regular, and this may have been handled already (but I couldn't find it when searching).
I've been using the new async/await functionality (congrats again to Yury on getting that through!), and I'd like to get a stack trace between the place at which blocking occurs and the outer co-routine.
For example, consider this code:
""" async def a(): await b()
async def b(): await switch()
@types.coroutine def switch(): yield
coro_a = a() coro_a.send(None) """
At this point I'd really like to be able to somehow get a stack trace similar to:
test.py:2
test.py:4
test.py:9
Using the gi_frame attribute of coro_a, I can get the line number of the outer frame (e.g.: line 2), but from there there is no way to descend the stack to reach the actual yield point.
I thought that perhaps the switch() co-routine could yield the frame object returned from inspect.currentframe(), however once that function yields that frame object has f_back changed to None.
A hypothetical approach would be to work the way down from the outer frame, but that requires getting access to the co-routine object that the outer frame is currently await-ing. Some hypothetical code could be:
""" def show(coro): print("{}:{}".format(coro.gi_frame.f_code.co_filename, coro.gi_frame.f_lineno)) if dis.opname[coro.gi_code.co_code[coro.gi_frame.f_lasti + 1]] == 'YIELD_FROM': show(coro.gi_frame.f_stack[0]) """
This relies on the fact that an await-ing co-routine will be executing a YIELD_FROM instruction. The above code uses a completely hypothetical 'f_stack' property of frame objects to pull the co-routine object which a co-routine is currently await-ing from the stack. I've implemented a proof-of-concept f_stack property in the frameobject.c just to test out the above code, and it seems to work.
With all that, some questions:
1) Does anyone else see value in trying to get the stack-trace down to the actual yield point?
2) Is there a different way of doing it that doesn't require changes to Python internals?
3) Assuming no to #2, is there a better way of getting the information compared to the pretty hacky byte-code/stack inspection?
Thanks,
Ben
On 2015-06-13 11:38, jaivish kothari wrote:
Hi,
I have a question; I hope I'll find the solution here.
Say I have a Queue.
>>> h = Queue.Queue(maxsize=0)
>>> h.put(1)
>>> h.put(2)
>>> h.empty()
False
>>> h.join()
>>> h.empty()
False
>>> h.get()
1
>>> h.get()
2
>>> h.get()
Blocked.......................
My question is: in a single-threaded environment, why does get() block instead of raising an exception? At the interpreter I have no way to resume working.
And my second question is: why do we have to explicitly call task_done() after get()? Why doesn't get() call it implicitly? An entry in unfinished_tasks is automatically added by put(); why isn't it removed by get()?
The way it's used is to get an item, process it, and then signal that you have finished with it. There's going to be a period of time when an item is no longer in the queue but is still being processed, and task_done()/join() exist to account for exactly that window.
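[Editor's note: the pattern MRAB describes — each get() paired with an explicit task_done(), and join() blocking until every item put() has been fully processed — can be sketched like this with the Python 3 queue module; the worker/consumer names are illustrative:]

```python
import queue
import threading

q = queue.Queue()
for item in (1, 2):
    q.put(item)          # each put() increments the unfinished-task count

results = []

def worker():
    while True:
        item = q.get()   # blocks until an item is available
        results.append(item)  # "process" the item
        q.task_done()    # signal that processing of this item finished

threading.Thread(target=worker, daemon=True).start()

q.join()                 # returns only once task_done() has matched every put()
print(results)
```

If get() implicitly called task_done(), join() would return as soon as the queue was drained, before the items were actually processed; the explicit call is what makes join() mean "all work finished" rather than "queue empty".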
participants (8)
-
Ben Leslie
-
Greg Ewing
-
Guido van Rossum
-
jaivish kothari
-
Mark Lawrence
-
MRAB
-
Nick Coghlan
-
Yury Selivanov