[New-bugs-announce] [issue42551] Generator `yield`s counted as primitive calls by cProfile

Matthew Suozzo report at bugs.python.org
Wed Dec 2 23:29:39 EST 2020

New submission from Matthew Suozzo <matthew.suozzo at gmail.com>:

# Issue

When profiling a generator function, the initial call and all subsequent yields are aggregated into the same "ncalls" metric by cProfile.

## Example

>>> cProfile.run("""
... def foo():
...   yield 1
...   yield 2
... assert tuple(foo()) == (1, 2)
... """)
   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        3    0.000    0.000    0.000    0.000 <string>:2(foo)

This was unexpected behavior since it *looks* like a single call from the code. This also complicates basic analysis about the frequency of a codepath's execution where a generator might yield a variable number of times depending on the input.

The profile interface can and does differentiate between call types: Normal calls and "primitive" calls, i.e. those that are not induced via recursion, are displayed as "all_calls/primitive_calls" e.g. "3/1" for a single initial calls with 2 recursed calls (comparable to the example above).

This seems like an appropriate abstraction to apply in the generator case: Each yield is better modeled as an 'interior' call to the generator, not as a call on its own.

# Possible fix

I have two ideas that seem like they might address the problem:

* Add a new PyTrace_YIELD constant (and handle it in [3])
* Track some state from the `frame->f_gen` in the ProfilerEntry (injected in [3], used in [4]) to determine whether this is the first or a subsequent call.

I've not been poking around for long so I don't have an intuition about which would be less invasive (nor really whether either would work in practice).

As background, this seems to be the call chain from trace invocation to callcount increment:
[0]: https://github.com/python/cpython/blob/4e7a69bdb63a104587759d7784124492dcdd496e/Python/ceval.c#L4106-L4107
[1]: https://github.com/python/cpython/blob/4e7a69bdb63a104587759d7784124492dcdd496e/Python/ceval.c#L4937
[2]: https://github.com/python/cpython/blob/4e7a69bdb63a104587759d7784124492dcdd496e/Python/ceval.c#L4961
[3]: https://github.com/python/cpython/blob/4e7a69bdb63a104587759d7784124492dcdd496e/Modules/_lsprof.c#L419
[4]: https://github.com/python/cpython/blob/4e7a69bdb63a104587759d7784124492dcdd496e/Modules/_lsprof.c#L389
[5]: https://github.com/python/cpython/blob/4e7a69bdb63a104587759d7784124492dcdd496e/Modules/_lsprof.c#L311-L316

components: C API
messages: 382373
nosy: matthew.suozzo
priority: normal
severity: normal
status: open
title: Generator `yield`s counted as primitive calls by cProfile
type: behavior
versions: Python 3.10, Python 3.6, Python 3.7, Python 3.8, Python 3.9

Python tracker <report at bugs.python.org>

More information about the New-bugs-announce mailing list