[Python-ideas] Standard way to get caller name and easy call stack access

Tue Mar 27 15:00:49 CEST 2012

On Tue, Mar 27, 2012 at 9:02 PM, anatoly techtonik <techtonik at gmail.com> wrote:
> On Thu, Mar 22, 2012 at 11:01 AM, Nick Coghlan <ncoghlan at gmail.com> wrote:
>>
>> Trying to find out your calling function is hard because well-designed
>> decoupled code with properly separated concerns *should never care who
>> is calling it*, so improving the APIs for doing so is an *extremely
>> low priority task*.
>
> You're saying that somebody specifically designed this part to be
> hard, because those who program properly should never request the name
> of calling function. Is that what you're wanted to say?

No, I'm saying the existence of frame objects and a call stack in the
first place is inherently an implementation detail, and hence it is
impossible to provide full access to it in an implementation-independent
fashion. We don't want to make it part of the language
spec, because we want to give implementors freedom to break those APIs
and still claim compliance with the Python language spec (many
implementors try to provide those APIs *anyway*, but they're well
within their rights not to do so).

> Correct me, but I think you were trying to say that nobody is
> interested in improving the way to get caller function, because people
> should not write the code that requests it. There are two counter
> arguments:
> 1. De facto existence of the code in standard library that is used a
> lot and that makes Python programs slow - logging

No, I am merely saying that such code is necessarily implementation
dependent. As far as I am aware, the reason printing tracebacks
(either directly or via logging) slows PyPy down is *because it has to
make sure the stack exists to be printed*, thus forcing PyPy to turn
off a bunch of optimisations.

Similarly, IronPython by default doesn't create a full frame stack -
you have to specify additional options to request creation of either
full or partial frames during execution in order to use modules that
rely on those features.

> 2. Practicality that beats purity. Debugging and introspection
> capabilities of the language are as important for development and
> maintenance as writing well-designed decoupled code with properly
> separated concerns (which mostly exists in ideal world without time,
> readability and performance constraints).
>
> For people who study Python applications programming that goes beyond
> function reference most of the time is spent in reverse coding -
> debugging and troubleshooting, and accessing a call stack without
> unnecessary implications makes a big difference in understanding how
> application (and Python) works.

The call stack is necessarily tightly coupled to the implementation,
and not all implementations will be easily able to introspect their
calling environment (since the ability to do so efficiently depends on
the underlying runtime). Guido made a deliberate choice a long time
ago to exclude that feature from the language requirements by
including the leading underscore in sys._getframe(). That and the
similar underscore on sys._current_frames() are not an accident -
they're there to give implementation authors additional freedom in how
they choose to implement function calls (and that includes the authors
of future versions of CPython).
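
To make the implementation dependence concrete, here is a minimal
sketch of a caller lookup built on sys._getframe(). The caller_name()
helper and its depth parameter are hypothetical (this is not a standard
library API), and the getattr() guard is an assumption about how code
might cope with implementations that don't provide the function at all:

```python
import sys


def caller_name(depth=1):
    """Return the name of the function `depth` levels above our caller.

    Hypothetical helper, not a standard library API. Returns None where
    sys._getframe() is missing or the stack is not deep enough.
    """
    getframe = getattr(sys, "_getframe", None)
    if getframe is None:
        return None  # this implementation does not expose frames
    try:
        # +1 skips the frame of caller_name() itself
        frame = getframe(depth + 1)
    except ValueError:
        return None  # call stack is not deep enough
    return frame.f_code.co_name


def inner():
    # Asks "who called me?"
    return caller_name()


def outer():
    return inner()
```

Calling outer() here reports "outer" as the caller of inner(), but only
on implementations that choose to provide sys._getframe().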

> As I am not a C expert - I can't immediately see any obvious problems
> with caching call stack information with qualified names on any
> platform. Straightforward idea is to annotate bytecode in a separate
> memory page during compilation.

You are thinking far too much about a single implementation here. The
problem is not implementing call stack access for CPython (or any
implementation that can provide the sys._getframe() API without much
additional effort). The problem is placing that constraint on all
current and future implementations of Python that claim conformance
with the language spec.

> Debugging facilities of the interpreter.. Huh.
> inspect.stack()[2][0].f_locals['self'].__class__.__name__ - is that
> the thing you keep in your head when breaking into Python console to
> get a name of a class for a calling method?
> And what should be invoked if a caller is not a method?

No, I expect people to use pdb or just read the tracebacks printed
when an exception is thrown. Runtime introspection of local variables
without the aid of a debugger or other tool is a seriously advanced
programming technique (and will *always* be highly implementation
dependent).

The following two options can be very useful for exploring problematic
parts of a program:

   import pdb; pdb.set_trace() # Break into pdb in a running program

   import pdb; pdb.pm() # Start pdb after an exception was thrown
(very useful in conjunction with the use of "python -i" to drop into
the interactive prompt instead of terminating)
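
For the "just read the traceback" route, the same call chain is also
available in plain form from a caught exception, without touching live
frames yourself. A small sketch using the standard traceback module
(the parse(), load() and failing_call_chain() functions are made up
purely for illustration):

```python
import traceback


def parse(text):
    return int(text)  # raises ValueError for non-numeric input


def load(text):
    return parse(text)


def failing_call_chain():
    """Return (function name, line number) pairs from a caught exception."""
    try:
        load("not a number")
    except ValueError as exc:
        # extract_tb() lists entries from the outermost call to the
        # innermost; each entry is (filename, lineno, name, source line)
        entries = traceback.extract_tb(exc.__traceback__)
        return [(entry[2], entry[1]) for entry in entries]
```

The names come back as failing_call_chain, load, parse, which is the
same order the printed traceback uses.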

>> Once again, you're painting with a broad sweeping brush "ah, it's
>> horrible, it's too hard to do anything worthwhile" (despite the
>> obvious contradiction that many projects already implement such things
>> quite effectively) instead of suggesting *small*, *incremental*
>> changes that might actually improve the status quo.
>
> Well, I proposed to have a caller_name() method. If it is not too
> universal, then a sys.call_stack() could be a better try:
>
>  sys.call_stack() - Get current call stack. Return the list with a
>                            list of qualified names starting with the oldest.
>
> Rationale: make logging calls and module based filters zero cost.

As others pointed out, I did indeed miss the concrete suggestions you
made. The idea of making qualname an attribute of the code object
rather than just the function certainly has merit, although it would
be quite an extensive patch to achieve it (as it would involve changes
to several parts of the code generation and execution chain, from the
AST evaluation step through to the main eval loop and the function and
class object constructors). Now, while 3.3 is still in alpha, is
definitely the right time for such a patch to be put forward (it would
be awkward, although possibly still feasible, to make such a change in
a backwards compatible way for 3.4 if the current implementation is
released as is for 3.3). There are other potentially beneficial use
cases for the new qualname attribute as well (e.g. the suggestion of
using it in tracebacks instead of __name__ is a good idea).
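
A quick illustration of the gap, assuming the __qualname__ attribute as
currently implemented for 3.3 (the Outer, Inner and factory names are
just examples): the function object knows its qualified name, while
co_name on the code object, which is what a frame refers to, is only
the bare name.

```python
class Outer:
    class Inner:
        def method(self):
            pass


def factory():
    def local():
        pass
    return local


func = Outer.Inner.method
print(func.__qualname__)         # Outer.Inner.method
print(func.__code__.co_name)     # method

nested = factory()
print(nested.__qualname__)       # factory.<locals>.local
print(nested.__code__.co_name)   # local
```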

As far as the call_stack() or caller_name() idea goes, it would
definitely be significantly less restrictive than requiring that
implementations expose the full frame machinery that CPython uses.
However, the other implementations (especially PyPy) would need to be
consulted regarding the possible performance implications. Use cases
would also need to be investigated to ensure that just the qualified
name is sufficient. Tracebacks and warnings, for example, require at
least the filename and line number, and other use cases may require
the module name. Any such discussion really needs to be driven by the
implications for Python runtimes that don't natively use CPython-style
execution frames, to ensure providing a rarely used introspection API
doesn't end up slowing down *all* applications on those platforms.
That particular performance concern doesn't affect CPython, since the
de facto API for stack introspection is the one that CPython uses
internally anyway and exposing it as is to users (albeit with the
"here be dragons" implementation dependence warning) is a relatively
cheap exercise.
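
As a rough sketch of what a richer call_stack() might return on
CPython, here is one possible shape that carries the module name,
filename and line number alongside the function name. Everything in it
is an assumption for the sake of discussion: the StackEntry record, its
field names and the oldest-first ordering are hypothetical, and it
leans directly on CPython-style frames via sys._getframe():

```python
import sys
from collections import namedtuple

# Hypothetical record type; the fields follow the use cases mentioned above.
StackEntry = namedtuple("StackEntry", "module name filename lineno")


def call_stack():
    """Return StackEntry records for the current stack, oldest call first.

    Illustrative only: relies on CPython-style frames via sys._getframe().
    """
    entries = []
    frame = sys._getframe(1)  # start at our caller, skipping call_stack()
    while frame is not None:
        entries.append(StackEntry(
            frame.f_globals.get("__name__"),
            frame.f_code.co_name,
            frame.f_code.co_filename,
            frame.f_lineno,
        ))
        frame = frame.f_back
    entries.reverse()  # walked newest-to-oldest, so flip the order
    return entries


def handler():
    return call_stack()


def dispatch():
    return handler()
```

The last two entries returned by dispatch() are named "dispatch" and
"handler". On a runtime without native frames, producing even this
much on demand is exactly the cost that would need to be measured.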

Regards,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia