Tracebacks for C code in Python

Here's something that's been bugging me for years. I'll suggest something, but since I'm a total newbie in this area, it's possible that what I'm saying is impossible or doesn't make sense.

I'm working with some Pandas code now, and there's an exception because I'm doing something wrong. I get a traceback, but some of the frames are in pyd files (C code, I guess?), so I don't see the code for them. This is frustrating because the exception message isn't that clear, so I would at least like to know what the code was trying to do when it raised the exception. Maybe that would give me more hints about what's going wrong.

*Would it be possible to have Python tracebacks include code for C code that's called in Python?*

I know very little about how the C-to-Python interaction works, and I assume we'd need something complicated, like packaging the source code with the binaries in some way that lets Python get the right line of C code to put in the traceback. This can get complicated. Do you think it's possible?

Thanks,
Ram.

At most, Python would only be able to tell you the Python name of the function it invoked. Tracebacks for native code are a different can of worms, especially since native code can be distributed without debugging symbols, especially on Windows. If you ever look at a processed native dump file for a Windows application, all you see for the most part is the memory address of each invoked function plus an offset from that address. If you had the debugging symbols, you would get the unmangled (demangled) name plus the offset.
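To make the point above concrete, here is a minimal sketch (not taken from the thread) showing that a function implemented in C contributes no frame of its own to a Python traceback. It uses the built-in len() as a stand-in for any C-implemented function; the name caller is made up for the illustration.

```python
import traceback


def caller():
    # len() is implemented in C; the TypeError is raised inside it.
    return len(5)


try:
    caller()
except TypeError as exc:
    # Collect the function name of every frame recorded in the traceback.
    frame_names = [frame.name for frame in traceback.extract_tb(exc.__traceback__)]
    message = str(exc)

print(frame_names)  # the innermost entry is the Python function "caller"
print(message)      # the only hint about len() is in the exception message
```

The innermost frame is the Python function that called len(), so the traceback stops at the boundary between Python and native code.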

You'll probably also want the Python extension for gdb, to get to the interesting code faster: https://stackoverflow.com/q/41160447/1763602

Ram Rachum wrote on 15.08.20 at 21:08:
Pandas is actually not implemented in C (or only a small part of it is) but in Cython. That is why you get tracebacks that include line numbers from extension modules at all. Extensions implemented in C do not normally provide this. The reason why the source code lines are not displayed is probably just that Pandas does not ship its source code but only the compiled modules. Remember that the reason why you get Python source code lines in tracebacks is that the Python code file is sitting right there in the installed package. If Pandas did the same thing, you'd probably also get code lines printed in tracebacks.

Stefan
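The following is a small sketch of that mechanism, under stated assumptions: the traceback only records a file name and a line number, and the source text is read from disk when the traceback is formatted. The module contents and the file name demo_module.py are hypothetical and exist only for this illustration.

```python
import os
import tempfile
import traceback

# Hypothetical two-line module; line 2 is the one that fails.
SOURCE = "def fail():\n    return 1 / 0\n"


def last_frame_text(filename):
    """Run SOURCE as if loaded from `filename`; format the innermost frame."""
    namespace = {}
    exec(compile(SOURCE, filename, "exec"), namespace)
    try:
        namespace["fail"]()
    except ZeroDivisionError as exc:
        return traceback.format_tb(exc.__traceback__)[-1]


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo_module.py")

    # 1) The file does not exist yet: only file name and line number appear.
    without_source = last_frame_text(path)

    # 2) Same code, but now the source file is on disk at that path.
    with open(path, "w") as handle:
        handle.write(SOURCE)
    with_source = last_frame_text(path)

print(without_source)
print(with_source)
```

Both results name line 2 of the same file, but only the second one includes the text of the failing line, because only then is there a file to read it from.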

Ram,

If you install pandas from source (https://github.com/pandas-dev/pandas#installation-from-sources), then there is a reasonable chance that you will be able to tie the error messages to the source code. If you find that the error messages are not clear, I would encourage you to raise ticket(s), and possibly pull requests, to clarify those error messages. I am reasonably sure that the authors of Pandas would welcome constructive feedback, but it does need to be specific, of course.

Steve Barnes
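As a rough way to check whether an installed module is a compiled extension and whether its source sits next to it, something like the sketch below might help. The helper source_status is hypothetical, the assumption that Cython sources use a .pyx file beside the binary is mine, and the module name pandas._libs.lib is only an example that requires pandas to be installed.

```python
import importlib.machinery
import importlib.util
from pathlib import Path

# Suffixes of compiled extension modules on this platform (.so, .pyd, ...).
EXT_SUFFIXES = tuple(importlib.machinery.EXTENSION_SUFFIXES)


def source_status(module_name):
    """Classify a module as 'python', 'compiled+source', 'compiled-only' or 'unknown'."""
    try:
        spec = importlib.util.find_spec(module_name)
    except (ImportError, ValueError):
        return "unknown", None
    if spec is None or not spec.origin or not Path(spec.origin).is_file():
        return "unknown", None  # not installed, built-in, or frozen
    origin = spec.origin
    for suffix in EXT_SUFFIXES:
        if origin.endswith(suffix):
            # Assumption: a Cython source would be named <stem>.pyx.
            pyx = Path(origin[: -len(suffix)] + ".pyx")
            kind = "compiled+source" if pyx.is_file() else "compiled-only"
            return kind, origin
    return "python", origin


print(source_status("json"))              # a pure-Python stdlib package
print(source_status("pandas._libs.lib"))  # 'unknown' if pandas is not installed
```

A 'compiled-only' result would match the situation described in the thread: line numbers can be reported, but there is no source file to print lines from.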

participants (6): Barry, Marco Sulla, Ram Rachum, Stefan Behnel, Steve Barnes, William Pickard