Function arguments in tracebacks

Consider the following similar C and Python code and their tracebacks:

C
-------
int divide(int x, int y, char* some_string) {
    return x / y;
}

int main(...) {
    divide(2, 0, "Hello World");
}
-------
Program received signal SIGFPE, Arithmetic exception.
(gdb) bt
#0  0x00000000004004c4 in divide (x=2, y=0, some_string=0x4005a8 "Hello World") at test.c:2
#1  0x00000000004004e7 in main (argc=1, argv=0x7fffffffe328) at test.c:6

Python
-------
def divide(x, y, some_string):
    return x / y

divide(2, 0, "Hello World")
-------
Traceback (most recent call last):
  File "test.py", line 4, in <module>
  File "test.py", line 2, in divide
ZeroDivisionError: division by zero

By including the function arguments within the traceback, we can get more information at a glance than we could with just the names of methods. This would be pretty cool and would stop the occasional "printf" debugging without cluttering up the traceback too much.

There will definitely need to be some reasonable line length limit because the repr() of parameters could be really long. In similar situations gdb replaces the value in the traceback with an ellipsis, and I believe that's a good solution for Python as well.

Obviously this isn't a great example since the error is immediately obvious, but I think this could be potentially useful in a bunch of situations. I've made a quick toy implementation in traceback.c; this is what it looks like for the script above:

Traceback (most recent call last):
  File "test.py", line 4, in <module>
    divide(2, 0, "Hello World")
  File "test.py", line 2, in divide (x=2, y=0, some_string='Hello World')
    return x / y
ZeroDivisionError: division by zero

== Potential Downsides ==

There are probably a lot more than these, but I could only think of these so far.

* Private data might be leaked; imagine a def login(username, password): ... method. While function names/source files/source code are also private, variables can potentially contain all kinds of sensitive data.
* A variable that takes a long time to return a string representation may significantly slow down the time it takes to generate a traceback.
* We can really only return the state of the variables when the traceback is printed; this might result in some slightly unintuitive behavior. (Easier to explain with an example.)

def f(x):
    x = 2
    raise Exception()

f(1)

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 3, in f(x=2)

Because x is rebound within the function body, the value printed in the traceback is the changed value (2) rather than the value that was passed in (1), which might be slightly misleading.

I'd love to hear you guys' thoughts on the idea.
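For readers who want to experiment without patching traceback.c, the idea can be roughly approximated in pure Python. This is only a sketch and not the toy implementation mentioned above: the helper names (short_repr, frame_arguments, format_traceback_with_args) and the 40-character limit are assumptions made for illustration, and source lines are not printed.

```python
MAX_REPR = 40  # assumed limit; the post only asks for "some reasonable" one


def short_repr(value, limit=MAX_REPR):
    """repr() of a value, cut off with an ellipsis the way gdb does."""
    try:
        text = repr(value)
    except Exception:
        # A broken __repr__ must not break traceback printing.
        text = "<repr failed>"
    if len(text) > limit:
        text = text[: limit - 3] + "..."
    return text


def frame_arguments(frame):
    """Return 'name=value, ...' for the named parameters of a frame."""
    code = frame.f_code
    names = code.co_varnames[: code.co_argcount + code.co_kwonlyargcount]
    missing = object()
    parts = []
    for name in names:
        # The value is read now, so a rebound parameter shows its new value.
        value = frame.f_locals.get(name, missing)
        shown = "<deleted>" if value is missing else short_repr(value)
        parts.append("%s=%s" % (name, shown))
    return ", ".join(parts)


def format_traceback_with_args(exc):
    """Format an exception's traceback with arguments after each name."""
    lines = ["Traceback (most recent call last):"]
    tb = exc.__traceback__
    while tb is not None:
        code = tb.tb_frame.f_code
        location = '  File "%s", line %d, in %s' % (
            code.co_filename, tb.tb_lineno, code.co_name)
        args = frame_arguments(tb.tb_frame)
        lines.append(location + (" (%s)" % args if args else ""))
        tb = tb.tb_next
    lines.append("%s: %s" % (type(exc).__name__, exc))
    return "\n".join(lines)


def divide(x, y, some_string):
    return x / y


try:
    divide(2, 0, "Hello World")
except ZeroDivisionError as exc:
    report = format_traceback_with_args(exc)
    print(report)
```

Because the values are looked up when the report is built, the f(1) example above would show x=2 here as well. The try/except around repr() only guards against a failing __repr__, not a slow one.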

My quick on-vacation response is that attaching more objects to exceptions is typically viewed as dangerous as it can lead to those objects being kept alive longer than expected (see the discussions about richer error messages to see that worry come out for something as simple as attaching the type to a TypeError). On Tue, 27 Dec 2016 at 09:26 Ammar Askar <ammar@ammaraskar.com> wrote:

I think an argument could be made for including the str() of parameters of primitive types and with small values (for some value of "primitive" and "small", can of worms here...). I'm thinking numbers and short strings. Maybe a flag to control this behaviour? My gut feeling is that this would be a hack with lots of corner cases and surprises so it would probably not be very helpful in the general case.
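One possible reading of "primitive" and "small" is sketched below. Everything in it is an assumption made for illustration: the function name primitive_repr, the 20-character and 64-bit limits, the "..." placeholder, and the use of repr() instead of str() so that strings keep their quotes.

```python
def primitive_repr(value, max_len=20, placeholder="..."):
    """Show a value only if it is a small builtin; otherwise a placeholder."""
    kind = type(value)  # exact type check: subclasses may define any __repr__
    if value is None or kind is bool:
        return repr(value)
    if kind is int:
        # Huge ints are slow to convert and can raise on recent Pythons.
        return repr(value) if value.bit_length() <= 64 else placeholder
    if kind is float:
        return repr(value)
    if kind in (str, bytes):
        return repr(value) if len(value) <= max_len else placeholder
    # Containers and user-defined objects are never shown.
    return placeholder


print(primitive_repr(2), primitive_repr("Hello World"), primitive_repr([1, 2]))
```

Even this small filter has to special-case bool, very large integers, and subclasses, which fits the gut feeling about corner cases.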

On Dec 28, 2016 12:44, "Brett Cannon" <brett@python.org> wrote:
> My quick on-vacation response is that attaching more objects to exceptions is typically viewed as dangerous as it can lead to those objects being kept alive longer than expected (see the discussions about richer error messages to see that worry come out for something as simple as attaching the type to a TypeError).

This isn't an issue for printing arguments or other locals in tracebacks, though. The traceback printing code can access anything in the frame stack.

-n
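As a concrete illustration of that point: since Python 3.5 the pure-Python traceback module can already read frame locals through traceback.TracebackException with capture_locals=True. It prints every local in each frame rather than just the arguments, and it is not what the default C-level printer does, so treat this as a sketch of what is reachable from the frame stack. The run() wrapper is only there to keep the example small.

```python
import traceback


def divide(x, y, some_string):
    return x / y


def run():
    try:
        divide(2, 0, "Hello World")
    except ZeroDivisionError as exc:
        # capture_locals=True records the repr() of each frame's locals.
        summary = traceback.TracebackException.from_exception(
            exc, capture_locals=True)
        return "".join(summary.format())


text = run()
print(text)
```

TracebackException stores string representations of the locals rather than the objects themselves, so nothing extra is attached to the exception.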

On Wed, Dec 28, 2016 at 2:13 PM, Nathaniel Smith <njs@pobox.com> wrote:
Right. I'd actually be more worried about security leaks than memory leaks. Imagine you're calling a password checking function that got bytes instead of text: what amounts to a type check could leak the plaintext password.

One rarely sees a C traceback, let alone a textual one, except during development, whereas Python tracebacks are seen during development and after deployment.

Mahmoud
https://github.com/mahmoud

On 29 December 2016 at 08:13, Nathaniel Smith <njs@pobox.com> wrote:
Right, the reasons for the discrepancy here are purely pragmatic ones:

- the default traceback printing machinery in CPython is written in C, and we don't currently have readily available tools at that layer to print a nice structured argument list the way gdb does for C functions (and there are good reasons for us to want the interpreter to be able to print tracebacks even if it's in a sufficiently unhealthy state that the "traceback" module won't run, so delegating the problem to Python level tooling isn't an answer for CPython)
- displaying local variables in runtime tracebacks (as opposed to in interactive debuggers like gdb) is a known security risk that we don't currently provide good tools for handling in the standard library (e.g. we don't offer str and bytes subclasses with opaque representations that don't reveal their contents). Even if we did offer them, they'd still be opt-in for reasons of usability when working with data that *isn't* security sensitive.

However, neither of those arguments applies to the "where" command in pdb, and that doesn't currently display this kind of information either:

    >>> def f(x, y, message):
    ...     return x/y, message
    ...
    >>> f(2, 0, "Hello world")
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "<stdin>", line 2, in f
    ZeroDivisionError: division by zero
    >>> import pdb; pdb.pm()
    > <stdin>(2)f()
    (Pdb) w
      <stdin>(1)<module>()->None
    > <stdin>(2)f()
    (Pdb)

pdb already knows what the arguments are, as it can print them if you ask for them explicitly:

    (Pdb) args
    x = 2
    y = 0
    message = 'Hello world'

So I think this kind of change may make a lot of sense as an RFE for pdb's "where" command (with the added bonus that projects like pdbpp could make it available to earlier Python versions as well).

Cheers,
Nick.

--
Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
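To make the "opaque representation" idea concrete, here is a minimal sketch of what such an opt-in str subclass might look like. OpaqueStr is a hypothetical name; nothing like it exists in the standard library, which is exactly the gap described above.

```python
class OpaqueStr(str):
    """Hypothetical str subclass whose repr() does not reveal its contents."""

    def __repr__(self):
        return "<%s: contents hidden>" % type(self).__name__


password = OpaqueStr("hunter2")

# Anything that displays repr(), such as arguments in a traceback, sees
# only the placeholder.
shown = "login(username=%r, password=%r)" % ("alice", password)
print(shown)

# The value still compares and behaves like a normal string.
print(password == "hunter2")
```

This only covers repr(): str(password), string formatting with %s, and methods such as upper() still return the plain text, so a real design would need to decide how far the opacity should extend.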

participants (7)
- Ammar Askar
- Brett Cannon
- Emanuel Landeholm
- Mahmoud Hashemi
- MRAB
- Nathaniel Smith
- Nick Coghlan