Hi all, Right now, when a debugger is active, the number of local variables can affect the tracing speed quite a lot. For instance, having tracing setup in a program such as the one below takes 4.64 seconds to run, yet, changing all the variables to have the same name -- i.e.: change all assignments to `a = 1` (such that there's only a single variable in the namespace), it takes 1.47 seconds (in my machine)... the higher the number of variables, the slower the tracing becomes. ``` import time t = time.time() def call(): a = 1 b = 1 c = 1 d = 1 e = 1 f = 1 def noop(frame, event, arg): return noop import sys sys.settrace(noop) for i in range(1_000_000): call() print('%.2fs' % (time.time() - t,)) ``` This happens because `PyFrame_FastToLocalsWithError` and `PyFrame_LocalsToFast` are called inside the `call_trampoline` ( https://github.com/python/cpython/blob/master/Python/sysmodule.c#L946). So, I'd like to simply remove those calls. Debuggers can call `PyFrame_LocalsToFast` when needed -- otherwise mutating non-current frames doesn't work anyways. As a note, pydevd already has such a call: https://github.com/fabioz/PyDev.Debugger/blob/0d4d210f01a1c0a8647178b2e665b5... and PyPy also has a counterpart. As for `PyFrame_FastToLocalsWithError`, I don't really see any reason to call it at all. i.e.: something as the code below prints the `a` variable from the `main()` frame regardless of that and I checked all pydevd tests and nothing seems to be affected (it seems that accessing f_locals already does this: https://github.com/python/cpython/blob/cb9879b948a19c9434316f8ab6aba9c4601a8..., so, I don't see much reason to call it at all). ``` def call(): import sys frame = sys._getframe() print(frame.f_back.f_locals) def main(): a = 1 call() if __name__ == '__main__': main() ``` Does anyone see any issue with this? If it's non controversial, is a PEP needed or just an issue to track it would be enough to remove those 2 lines? Thanks, Fabio