[Cython] line tracing/profiling code objects
Syam Gadde
syam.gadde at duke.edu
Thu Mar 27 19:10:17 CET 2014
Stefan,
I'm sorry to be so long in responding. Your response was very helpful,
but I had to suddenly change my focus to other projects. I've been
looking at it again.
I took the easy way out for now and attempted to name the codeobj
objects deterministically, as you suggested at the end of your message:
https://github.com/SyamGadde/cython/commit/b56135154f546eec34d88342dc1698cf1803492e
The assumption is that only one codeobj is ever generated for each named
function within a module. I'm afraid I may have taken a hammer to the
code instead of a chisel, so your comments are welcome.
This is actually sufficient for using the kernprof/line_profiler
decorator @profile as-is on Cython-generated non-cdef functions/methods
as long as you are using the following cython directives:
--directive profile=true
--directive linetrace=true
--directive binding=true
The binding=true is necessary so that the function objects get the
func_code attribute set to the codeobj above. Actually line_profiler
just uses func_code as a unique identifier for the function and doesn't
really look at the contents of func_code except to determine if a
function is a generator. If there is another attribute available for
uniquely identifying function objects that don't have func_code, then
line_profiler could potentially use that too and it could work on cdef-
and cpdef-declared functions...
...except for the fact that cdef and cpdef functions don't accept
arbitrary decorators, which is the typical way to use
kernprof/line_profiler. But I'm pretty happy to be able to use
kernprof/line_profiler on compiled pure-Python functions for now.
Thanks for your help!
-syam
On 02/27/2014 04:06 PM, Stefan Behnel wrote:
> Hi!
>
> Syam Gadde, 27.02.2014 16:22:
>> I tried using line tracing in conjunction with line_profiler/kernprof to
>> attempt to see if line-based profiling works in Cython. I think it's
>> pretty close. But I think the problem is that line_profiler is getting
>> a different PyCodeObject when it wraps the function and when it actually
>> gets the trace call. It adds the initial code object to a map, and
>> later when it gets the trace call, decides that the trace call is not
>> something it needs to pay attention to because the new code object that
>> it gets is not the same as the original one.
>>
>> The first code object is created at function declaration by
>> __Pyx_PyCode_New (called in__Pyx_InitCachedConstants) and is assigned to
>> a variable __pyx_k_codeobj_NNN. The second code object is created,
>> essentially, during the first entry into the function (in
>> __Pyx_TraceCall, via __Pyx_TraceSetupAndCall). It seems that setting
>> __pyx_frame_code to the initial code object before calling TraceCall()
>> would fix this.
> That's a part of it, yes. Here's another bit in the puzzle:
>
> https://github.com/cython/cython/pull/93
>
> The problem is that the code objects that we currently create along the way
> have a fixed Python code line number (because that was the simplest way to
> get it working). The two changes above will get us pretty far. What remains
> to be done then is to enable line number calculation at runtime by
> emulating byte code offsets in the frame object instead of using absolute
> line numbers. The pull request refers to CPython's
> Objects/lnotab_notes.txt, which has some details on the inner workings.
>
> Properly calculating line numbers at runtime would also have other nice
> side effects, because it would remove the need to create multiple code
> objects for a single function in the first place. So it's very desirable.
>
>
>> Is this easy to do? I'd do it myself, but I'd need help figuring out
>> how to get the name of the appropriate __pyx_k_codeobj_NNN variable from
>> within FuncDefNode.generate_function_definitions(), which calls
>> put_trace_call().
> The best way is usually to run it through a debugger and see what you can
> get your hands on. :) The constant names (__pyx_k_...) are generated on the
> fly at C code generation time, so you can't get them before that. Two
> possible solutions: either change the way how the C names of code objects
> are being generated (e.g. by making their C name depend on the mangled C
> name of the function so that they can be deterministically named at
> analysis time), or make the code object node remember its generated
> constant name and reuse it in other places.
>
> As you can see, it's not entirely trivial overall, but we've already been
> piling up the bits and pieces so that the only remaining effort is to put
> everything together. If you could give it a try, I'm sure you'd make a lot
> of people happy with it.
>
> Stefan
>
> _______________________________________________
> cython-devel mailing list
> cython-devel at python.org
> https://mail.python.org/mailman/listinfo/cython-devel
More information about the cython-devel
mailing list