[Python-Dev] Obtaining stack-frames from co-routine objects

Sat Jun 13 12:25:59 CEST 2015

On 13 June 2015 at 17:22, Nick Coghlan <ncoghlan at gmail.com> wrote:
> On 13 June 2015 at 04:13, Guido van Rossum <guido at python.org> wrote:
>> IOW I don't think that the problem here is that you haven't sufficiently
>> motivated your use case -- you are asking for information that just isn't
>> available. (Which is actually where you started the thread -- you can get to
>> the frame of the coroutine but there's nowhere to go from that frame.)
>
> If I'm understanding Ben's request correctly, it isn't really the
> stack trace that he's interested in (as you say, that specific
> phrasing doesn't match the way coroutine suspension works), but rather
> having visibility into the chain of control flow delegation for
> currently suspended frames: what operation is the outermost frame
> ultimately blocked *on*, and how did it get to the point of waiting
> for that operation?
>
> At the moment, all of the coroutine and generator-iterator resumption
> information is implicit in the frame state, so we can't externally
> introspect the delegation of control flow in a case like Ben's
> original example (for coroutines) or like this one for generators:
>
>     def g1():
>         yield 42
>
>     def g2():
>         yield from g1()
>
>     g = g2()
>     next(g)
>     # We can tell here that g is suspended
>     # We can't tell that it delegated flow to a g1() instance
>
> I wonder if in 3.6 it might be possible to *add* some bookkeeping to
> "await" and "yield from" expressions that provides external visibility
> into the underlying iterable or coroutine that the generator-iterator
> or coroutine has delegated flow control to. As an initial assessment,
> the runtime cost would be:
>
> * an additional pointer added to generator/coroutine objects to track
> control flow delegation
> * setting that when suspending in "await" and "yield from" expressions
> * clearing it when resuming in "await" and "yield from" expressions

Thanks, Nick, for rephrasing with the appropriate terminology. I had tried
to get it right, but with a background in implementing OS kernels I have
a strong habit of using that field's terminology, which I need to break.

I agree with your suggestion that explicitly having the pointers is much
nicer than my opcode hack implementation.
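
To make the bookkeeping idea concrete, here is a rough pure-Python
emulation of it. This is only a sketch: DelegationTracker and delegate_to
are hypothetical names made up for illustration, not anything in CPython,
and it sets the pointer when delegation starts and clears it when
delegation ends, rather than on every suspend and resume.

```python
class DelegationTracker:
    """Hypothetical helper: records which sub-iterator is delegated to."""

    def __init__(self):
        # Plays the role of the proposed extra pointer on the generator.
        self.delegate = None

    def delegate_to(self, sub):
        # Simplification: set once when delegation starts, cleared when it
        # ends (normally, on error, or on close).
        self.delegate = sub
        try:
            return (yield from sub)
        finally:
            self.delegate = None


def g1():
    yield 42


def g2(tracker):
    yield from tracker.delegate_to(g1())


tracker = DelegationTracker()
g = g2(tracker)
first = next(g)  # g is now suspended inside the g1() instance
suspended_in = tracker.delegate.gi_code.co_name
next(g, None)  # run g to completion, which clears the pointer
```

While g is suspended, tracker.delegate refers to the g1() instance, which
is exactly the information that can't be recovered from g itself in Nick's
example.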

Without side-tracking this discussion, I do just want to say that the
hypothetical code actually works if frame objects expose their value
stack, which is possibly as easy as:

+static PyObject *
+frame_getstack(PyFrameObject *f, void *closure)
+{
+    PyObject **p;
+    PyObject *list = PyList_New(0);
+
+    if (list == NULL)
+        return NULL;
+
+    if (f->f_stacktop != NULL) {
+        for (p = f->f_valuestack; p < f->f_stacktop; p++) {
+            /* FIXME: is this the correct error handling condition? */
+            if (PyList_Append(list, *p)) {
+                Py_DECREF(list);
+                return NULL;
+            }
+        }
+    }
+
+    return list;
+}

I have implemented and tested this and it worked well (although I really
don't know CPython internals well enough to know whether the above code
has any serious issues).

Is there any reason an f_stack attribute is not exposed for frames? Many of
the other PyFrameObject values are exposed. I'm guessing that in normal
operation there aren't many places where you can get hold of a frame with a
non-empty stack, so it probably hasn't been necessary.

Anyway, I'm not suggesting that adding f_stack is better than explicitly
adding pointers, but it does seem like a more general thing to expose, and
it would enable this use case without requiring extra bookkeeping data
structures.

Cheers,

Ben