
Hi,

sorry about the lengthy email, but you asked for details :)

Windows is certainly out of scope (I am not even sure whether anybody has succeeded in compiling libunwind on Windows; it probably needs lots of porting).

Some technical details on how it works now (I have not finished it completely yet):

In the PyPy world:

* _U_dyn_register, exposed by libunwind, is used to save every piece of assembler generated by the JIT (_U_dyn_cancel is called once a loop token is finally collected). Otherwise one cannot reliably do the matching between JIT trace <-> native libunwind symbol.
* The stack PyPy maintains for its frames is still maintained as before.

Did you do that back then for PyPy as well?

Stack walking:

Depending on vmprof.enable(native=True/False), either:

1) native=True: the C stack is walked, matching a special symbol __vmprof_eval_vmprof to the entries of 'kind' VMPROF_CODE_TAG. All other symbols exposed by pypy-c or libpypy-c.so are *ignored*. Most of them are internal functions of the interpreter loop. For PyPy this is not entirely true, because we include large parts of the standard library within libpypy-c.so, whereas CPython separates them into other shared objects. This simply means you cannot log stack frames within those shared objects/executable. Using the facility in libunwind for dynamic code (_U_dyn_register), one can match the JIT frames with 'kind' VMPROF_JITTED_TAG. All other symbols are considered native and are logged as kind VMPROF_NATIVE_TAG.

2) native=False, which means the stack is walked as it was before (iterating the list of PyPy stack frames).

This means the current setup allows you to use whichever method you prefer.

Some notes about the properties:
* everything works without libunwind, native=True raises an exception
* with libunwind, we don't lose frames in python just because libunwind is unable to reconstruct the stack
Yes, I agree, though I would like to have that in the same pypy-c executable, meaning that native=True does not raise, but simply fulfills the second property.
* we don't pay 10x storage just because there is an option to want native frames
As described above, the filtering of libpypy-c.so/pypy-c internal symbols greatly reduces the size of the resulting traces. If you use gdb from time to time, you know that 500 stack entries is nothing for CPython :) The filtering did a good job of making that much smaller (I did some tests deep down in a pytest stack frame: 300 stack frames became 37 frames, and all but ~5 of them were Python-level stack frames).
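To make the filtering concrete, here is a rough Python sketch of the classification I described above. It is not the actual vmprof code (that lives in C): the tag values are placeholders rather than the real constants, and classify_frame, filter_stack and the (symbol, shared_object, is_dynamic) tuples are made up for illustration. is_dynamic stands for "this address was registered through _U_dyn_register".

```python
from typing import Iterable, List, Optional, Tuple

# Placeholder tag values for illustration only; not the real vmprof constants.
VMPROF_CODE_TAG = "code"
VMPROF_JITTED_TAG = "jitted"
VMPROF_NATIVE_TAG = "native"

EVAL_SYMBOL = "__vmprof_eval_vmprof"
# Objects whose remaining symbols are treated as interpreter internals.
INTERPRETER_OBJECTS = {"pypy-c", "libpypy-c.so"}

Frame = Tuple[str, str, bool]  # (symbol, shared_object, is_dynamic), hypothetical


def classify_frame(symbol: str, shared_object: str, is_dynamic: bool) -> Optional[str]:
    """Return the tag for one native frame, or None if the frame is dropped."""
    if is_dynamic:
        # Code registered as dynamic with libunwind is a JIT frame.
        return VMPROF_JITTED_TAG
    if symbol == EVAL_SYMBOL:
        # The special eval symbol marks a Python-level frame. It must be
        # checked before the interpreter filter, since it lives in pypy-c too.
        return VMPROF_CODE_TAG
    if shared_object in INTERPRETER_OBJECTS:
        # Every other interpreter symbol is ignored.
        return None
    # Everything else counts as a native frame.
    return VMPROF_NATIVE_TAG


def filter_stack(frames: Iterable[Frame]) -> List[Tuple[str, str]]:
    """Keep only the frames that get a tag, as (tag, symbol) pairs."""
    kept = []
    for symbol, shared_object, is_dynamic in frames:
        tag = classify_frame(symbol, shared_object, is_dynamic)
        if tag is not None:
            kept.append((tag, symbol))
    return kept
```

Feeding it a stack that is mostly interpreter internals leaves only the eval, JIT and foreign native frames, which is where the size reduction comes from.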
On Linux I was getting ~7% of stacks that were not correctly rebuilt. This is not an issue if you assume that the 7% is statistically distributed evenly, but I heavily doubt that is the case (and there is no way to check), which made us build a more robust approach.
We could check that by logging 'trace was canceled' and saving a timestamp/counter for each signal. If a certain pattern occurred, you could tell the user: 'this profile has skipped lots of signals in a very short period, please run with native=False'.

Can you elaborate on the issue on PyPy + Mac OS X? I remember that you said that there are issues with the JIT code map. Which ones?

Cheers,
Richard
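
P.S. A rough sketch of the kind of check I mean for the canceled traces, assuming we end up with one timestamp (in seconds) per canceled signal. The function name has_cancel_burst and the window/threshold defaults are invented for illustration; nothing like this exists in vmprof yet, and the right values would need experimenting.

```python
from typing import Iterable


def has_cancel_burst(cancel_times: Iterable[float],
                     window: float = 0.1,
                     threshold: int = 5) -> bool:
    """Return True if at least `threshold` canceled traces fall within
    any span of `window` seconds (sliding window over sorted timestamps)."""
    times = sorted(cancel_times)
    start = 0
    for end in range(len(times)):
        # Shrink the window from the left until it spans at most `window`.
        while times[end] - times[start] > window:
            start += 1
        if end - start + 1 >= threshold:
            return True
    return False
```

If this returns True after profiling, the tool could print the 'please run with native=False' hint.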