[Python-Dev] Obtaining stack-frames from co-routine objects

Ben Leslie benno at benno.id.au
Fri Jun 12 08:38:03 CEST 2015

On 2 June 2015 at 14:39, Yury Selivanov <yselivanov.ml at gmail.com> wrote:
> Hi Ben,
> On 2015-05-31 8:35 AM, Ben Leslie wrote:
>> Hi Yury,
>> I'm just starting my exploration into using async/await; all my
>> 'real-world' scenarios are currently hypothetical.
>> One such hypothetical scenario however is that if I have a server
>> process running, with some set of concurrent connections, each managed
>> by a co-routine. Each co-routine is of some arbitrary complexity e.g:
>> some combination of reading files, reading from database, reading from
>> peripherals. If I notice one of those co-routines appears stuck and
>> not making progress, I'd very much like to debug that, and preferably
>> in a way that doesn't necessarily stop the rest of the server (or even
>> the co-routine that appears stuck).
>> The problem with the "if debug: log(...)" approach is that you need
>> foreknowledge of the fault state occurring; on a busy server you don't
>> want to just be logging every 'switch()'. I guess you could do
>> something like "switch_state[outer_coro] = get_current_stack_frames()"
>> on each switch. To me double book-keeping something that the
>> interpreter already knows seems somewhat wasteful but maybe it isn't
>> really too bad.
> I guess it all depends on how "switching" is organized in your
> framework of choice.  In asyncio, for instance, all the code that
> knows about coroutines is in tasks.py.  `Task` class is responsible
> for running coroutines, and it's the single place where you would
> need to put the "if debug: ..." line for debugging "slow" Futures--
> the only thing that coroutines can get "stuck" on (the other thing
> is accidentally calling blocking code, but your proposal wouldn't
> help with that).

I suspect that I haven't properly explained the motivating case.

My motivating case is being able to debug a relatively large, complex
system. If the system crashes (through an exception), or in some other
manner enters an unexpected state (co-routines blocked for too long)
it would be very nice to be able to debug an arbitrary co-routine, not
necessarily the one indicating a bad system state, to see exactly what
it is/was doing at the time the anomalous behavior occurs.

So this is a case of trying to analyse some system-wide behavior
rather than necessarily one particular task. Until you start
analysing the rest of the system, you don't know which co-routines
you want to analyse.

My motivation for this is primarily avoiding double book-keeping. I
assume that the framework has organised things so that there is
some data structure to find all the co-routines (or some other object
wrapping the co-routines) and that all the "switching" occurs in
one place.

With that in mind I can have some code that works something like:

import inspect

def switch():
    coro_stacks[current_coro] = inspect.getouterframes(inspect.currentframe())

I think something like this is probably the best approach to achieve
my desired goals with currently available APIs.
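To make the book-keeping approach concrete, here is a minimal sketch of a
switch point that records the stack each time a co-routine suspends. The
names Switch, run_once, read_request and handle_connection are hypothetical
and stand in for whatever a real framework provides; the sketch also stores
only function names and line numbers rather than the frame objects
themselves, which is an assumption made to avoid keeping frames alive.

```python
import inspect

# Hypothetical registry: coroutine object -> summary of its stack at last switch.
coro_stacks = {}
current_coro = None  # set by the (hypothetical) scheduler before each resume


class Switch:
    """Awaitable that records the caller's stack, then yields to the scheduler."""

    def __await__(self):
        # context=0 skips reading source lines; only names and line numbers are kept.
        frames = inspect.getouterframes(inspect.currentframe(), 0)
        coro_stacks[current_coro] = [(f.function, f.lineno) for f in frames]
        yield


async def read_request():
    await Switch()


async def handle_connection():
    await read_request()


def run_once(coro):
    """Resume coro until its next switch point (or until it finishes)."""
    global current_coro
    current_coro = coro
    try:
        coro.send(None)
    except StopIteration:
        pass
    finally:
        current_coro = None


conn = handle_connection()
run_once(conn)
# Innermost frame first: the switch point, then the co-routines that led to it.
names = [name for name, _ in coro_stacks[conn]]
conn.close()
```

Every switch pays for a stack walk and a dictionary update whether or not
anyone ever looks at the result, which is the cost being objected to here.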

However, I really don't like it, as it requires this double book-keeping. I'm
manually retaining this traceback for each coro, which seems like a
waste of memory, considering the interpreter already has this information
stored, just unexposed.

I feel it is desirable, and in line with existing Python patterns, to expose
the interpreter data structures rather than making the user do extra work
to access the same information.
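For comparison, a sketch of what reading the interpreter's own state could
look like. It assumes a Python version in which coroutine objects expose
cr_frame and cr_await, and generators expose gi_frame and gi_yieldfrom (the
inspect module documents these for 3.5 and later). The helper coro_stack and
the Suspend awaitable are hypothetical names used only for illustration.

```python
def coro_stack(coro):
    """Return (function name, line number) pairs, outermost first,
    for a suspended coroutine, by following what each level awaits."""
    stack = []
    while coro is not None:
        frame = getattr(coro, "cr_frame", None) or getattr(coro, "gi_frame", None)
        if frame is None:
            break  # finished, or not a coroutine/generator object
        stack.append((frame.f_code.co_name, frame.f_lineno))
        # Step to the object this level is currently waiting on, if any.
        coro = getattr(coro, "cr_await", None) or getattr(coro, "gi_yieldfrom", None)
    return stack


class Suspend:
    """Hypothetical awaitable that simply yields control once."""

    def __await__(self):
        yield


async def read_request():
    await Suspend()


async def handle_connection():
    await read_request()


conn = handle_connection()
conn.send(None)  # run until the first suspension point
suspended_names = [name for name, _ in coro_stack(conn)]
conn.close()
finished_stack = coro_stack(conn)  # a closed coroutine has no frame left
```

Nothing is recorded at switch time here; the stack is reconstructed on
demand, and only for the co-routines that turn out to be interesting.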



