GDB not breaking at the right place

I'm having a hard time debugging some virtual machine code because GDB won't break where it's supposed to. Here's my breakpoint #2:

    2       breakpoint     keep y   0x00005555556914fd ceval_reg.h:_PyEval_EvalFrameDefault:TARGET_JUMP_IF_FALSE_REG
            breakpoint already hit 1 time
            p/x oparg
            p (oparg >> 16) & 0xff | (oparg >> 8) & 0xff
            p oparg & 0xff
            p *fastlocals@4

but when it breaks, it's not at the beginning of the case (that is, where the TARGET_JUMP_IF_FALSE_REG label is defined), but inside the SETLOCAL macro of the COMPARE_OP_REG case! (That is, it's not anywhere close to the correct place.)

    case TARGET(COMPARE_OP_REG): {
        int dst = REGARG4(oparg);
        int src1 = REGARG3(oparg);
        int src2 = REGARG2(oparg);
        int cmpop = REGARG1(oparg);
        assert(cmpop <= Py_GE);
        PyObject *left = GETLOCAL(src1);
        PyObject *right = GETLOCAL(src2);
        PyObject *res = PyObject_RichCompare(left, right, cmpop);
        *SETLOCAL(dst, res);*
        if (res == NULL)
            goto error;
        DISPATCH();
    }

It actually breaks in the Py_XDECREF which is part of the SETLOCAL macro:

    #define SETLOCAL(i, value)      do { PyObject *tmp = GETLOCAL(i); \
                                         GETLOCAL(i) = value; \
                                         *Py_XDECREF(tmp)*; } while (0)

(actually, in the Py_DECREF underneath the Py_XDECREF macro).

I've configured like so:

    ./configure --with-pydebug --with-trace-refs --with-assertions

Python/ceval.c is compiled with this GCC command:

    gcc -pthread -c -Wno-unused-result -Wsign-compare -g -Og -Wall -std=c99 -Wextra -Wno-unused-result -Wno-unused-parameter -Wno-missing-field-initializers -Wstrict-prototypes -Werror=implicit-function-declaration -fvisibility=hidden -I./Include/internal -I. -I./Include -DPy_BUILD_CORE -o Python/ceval.o Python/ceval.c

I don't know if this is a GCC problem, a GDB problem, or a Skip problem. Is there more I can do to help the tool chain break at the correct place? It seems that if I break at a hard line number, GDB does the right thing, but I'd kind of prefer to use the symbolic label instead.

I rather like the notion of breaking at a label name, but if GCC/GDB can't figure things out, I guess I'll have to live with line numbers.

Thanks,

Skip

I suspect that you're running into the issue where compiler optimizations are *forced* on for ceval.c. There's a comment near the top about this. Just comment out this line:

    #define PY_LOCAL_AGGRESSIVE

We tried to define that macro conditionally, but something broke because the C stack frame for _PyEval_EvalFrameDefault became enormous without optimization, and some tests failed. (Maybe it was Victor's refleak test? The git history will tell you more if you're really interested.)

This is a nasty trap (I fell in myself, so that makes it nasty :-), but the proper fix would be convoluted -- we'd need a way to enable or disable this separately so the specific test can run but developers trying to step through ceval.c will be able to see the unoptimized code.

On Fri, May 21, 2021 at 12:40 PM Skip Montanaro <skip.montanaro@gmail.com> wrote:
-- --Guido van Rossum (python.org/~guido) *Pronouns: he/him **(why is my pronoun here?)* <http://feministing.com/2015/02/03/how-using-they-as-a-singular-pronoun-can-c...>

On Fri, May 21, 2021 at 2:48 PM Guido van Rossum <guido@python.org> wrote:
Thanks, Guido, however that doesn't seem to help. I grepped around for PY_LOCAL_AGGRESSIVE in the source. It seems to be specific to MSVC. Here's the definition in Include/pyport.h with a slight change to the indentation to demonstrate its scope better:

    #if defined(_MSC_VER)
    #  if defined(PY_LOCAL_AGGRESSIVE)
       /* enable more aggressive optimization for MSVC */
       /* active in both release and debug builds - see bpo-43271 */
    #    pragma optimize("gt", on)
    #  endif
       /* ignore warnings if the compiler decides not to inline a function */
    #  pragma warning(disable: 4710)
       /* fastest possible local call under MSVC */
    #  define Py_LOCAL(type) static type __fastcall
    #  define Py_LOCAL_INLINE(type) static __inline type __fastcall
    #else
    #  define Py_LOCAL(type) static type
    #  define Py_LOCAL_INLINE(type) static inline type
    #endif

I can move the actual point where GDB breaks by replacing -Og with -O0, but it still breaks at the wrong place, just a different wrong place. If I set a breakpoint by line number, it stops at the proper place.

Skip

Huh, you're right, I forgot that PY_LOCAL_AGGRESSIVE is specific to MSVC (maybe it wasn't always). I can think of nothing else apart from a gcc or gdb bug. Oh, hm, maybe computed gotos play havoc with the labels???

On Fri, May 21, 2021 at 2:01 PM Skip Montanaro <skip.montanaro@gmail.com> wrote:

I strongly suggest building Python with -O0 whenever you use gdb. -Og enables too many optimizations, which makes gdb less usable.

Victor

I strongly suggest building Python with -O0 whenever you use gdb. -Og enables too many optimizations, which makes gdb less usable.
Thanks, Victor. It never made sense to me that you would want any optimizations enabled when truly debugging code (as opposed to wanting debug symbols and a sane traceback in production code).

I'm getting more convinced that the problem I'm seeing is a GCC/GDB thing, particularly because I can move the erroneous stopping point by changing the GCC optimization level. I'll probably open a bugzilla report just so it's on that team's radar screen.

In the meantime, to get going again I wrote a crude script which maps the file:function:label form to file:linenumber form. That way I can save/restore breakpoints across GDB sessions and still avoid problems when the offsets to specific instructions change.

Skip

On 2021-05-23 14:56, Skip Montanaro wrote:
When I want to step through the regex module, I turn off optimisation, because any optimisation could move things around or combine things, making single-stepping difficult, and this is with Microsoft Visual Studio. Just turn off optimisation when you want to single-step.

Just turn off optimisation when you want to single-step.
But I don't just want to single-step. I want to break at the target label associated with a specific opcode. (I am - in fits and starts - working on register-based virtual machine instructions). If I'm working on, for example, the register version of POP_JUMP_IF_FALSE, stepping through a bunch of instances of the working register version of LOAD_FAST or EXTENDED_ARG isn't going to be helpful. Further, I have a set of GDB commands I want to execute at each breakpoint. And I want to do this across GDB sessions (so, I save breakpoints and user-defined commands in a GDB command file).

Just to make things concrete, here's what I want to print every time I hit my JUMP_IF_FALSE_REG statement's code:

    define print_opargs_jump
      p/x oparg
      p (oparg >> 16) & 0xff | (oparg >> 8) & 0xff
      p oparg & 0xff
      p *fastlocals@4
    end

This break command should do the trick:

    break ceval_reg.h:_PyEval_EvalFrameDefault:TARGET_JUMP_IF_FALSE_REG
    commands
      print_opargs_jump
    end

but it doesn't. GDB stops execution in some completely different one of the 50+ instructions I've implemented so far, and not even at the start of that other instruction. This is true whether I compile with -g -Og or -g -O0. The only difference between the two is that GDB stops execution at different incorrect locations. That, as you might imagine, makes debugging difficult.

Setting breakpoints by line number works as expected. In all the years I've been using GDB I've never had a problem with that. However, line numbers are fragile: the offsets of the instructions in the C code keep changing (add a new instruction, add or delete C code, reorder instructions for some reason, etc), so those kinds of breakpoints are difficult to maintain. I wrote a crude little script that converts the above break command into this:

    break ceval_reg.h:440
    commands
      print_opargs_jump
    end

This is just a workaround until someone (unlikely to be me) solves the problem with breaking at labels.

If someone could refute or verify my contention that breaking via labels is broken, I'd much appreciate it. I've not yet checked setting labeled breakpoints directly in ceval.c. To minimize merge conflicts, I'm implementing my register instructions in a new header file, Python/ceval_reg.h, which is #included in ceval.c at the desired spot. Maybe that factors into the issue.

Skip
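
The label-to-line workaround described above could look roughly like the following. This is only a sketch, not Skip's actual script (which is not shown in the thread). It assumes each opcode's label is introduced by a `TARGET(NAME)` macro on the `case` line, as in the COMPARE_OP_REG snippet earlier, and the function names `find_target_line` and `rewrite_gdb_commands` are made up for illustration.

```python
import re

# Matches "break FILE:FUNCTION:TARGET_OPNAME" (or the short form "b ...").
BREAK_RE = re.compile(
    r"^(\s*(?:break|b)\s+)([^\s:]+):([^\s:]+):TARGET_(\w+)\s*$"
)


def find_target_line(source_text, opname):
    """Return the 1-based line number of TARGET(opname), or None.

    Assumption: the label is generated by a TARGET(NAME) macro that
    appears on the opcode's "case" line in the C source.
    """
    pattern = re.compile(r"\bTARGET\(\s*" + re.escape(opname) + r"\s*\)")
    for lineno, line in enumerate(source_text.splitlines(), start=1):
        if pattern.search(line):
            return lineno
    return None


def rewrite_gdb_commands(gdb_text, sources):
    """Rewrite label breakpoints in a GDB command file to FILE:LINE form.

    `sources` maps a file name as written in the break command (for
    example "ceval_reg.h") to that file's text. Lines that are not label
    breakpoints, or whose label cannot be found, pass through unchanged.
    """
    out = []
    for line in gdb_text.splitlines():
        m = BREAK_RE.match(line)
        if m:
            prefix, filename, _function, opname = m.groups()
            source_text = sources.get(filename)
            if source_text is not None:
                lineno = find_target_line(source_text, opname)
                if lineno is not None:
                    line = f"{prefix}{filename}:{lineno}"
        out.append(line)
    return "\n".join(out)
```

Running the saved command file through `rewrite_gdb_commands` before each GDB session keeps the symbolic form as the thing you maintain, while GDB only ever sees line numbers regenerated from the current source.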

"Debugging" means many things. Python is built with -Og because it makes Python faster than -O0, and most developers debug Python code, not C code (in gdb). If you don't need to go up to the gdb/lldb level, -Og is fine. It would even make sense to build Python with -O3 in debug mode if you don't debug C code at all, only pure Python code.

My proposal to switch to -O0 by default was rejected: https://bugs.python.org/issue38350

I also love -O0 when I modify C code because it makes the build faster ;-)

Fedora Python debug builds are now built with -O0, which makes gdb a much more pleasant experience: no more strange behavior with inlined code or "<optimized out>" local variables or function arguments.

Victor

On Sun, May 23, 2021 at 3:57 PM Skip Montanaro <skip.montanaro@gmail.com> wrote:
-- Night gathers, and now my watch begins. It shall not end until my death.

I'm confused. I've always assumed that --with-pydebug was intended for the situation where you're modifying the C code, so obviously you might have to debug C code. (I know that was the case when we introduced it, decades ago.) If that's not the goal, then what is --with-pydebug used for?

On Mon, May 24, 2021 at 3:35 AM Victor Stinner <vstinner@python.org> wrote:

The default compiler flags for --with-pydebug are a trade-off between "runtime checks for debug" and performance. -O0 makes Python slower. For example, we want the Python CI to run as fast as possible.

I don't want to fight for https://bugs.python.org/issue38350, so I simply learnt to type:

    ./configure --with-pydebug CFLAGS="-O0"

(I have a shell alias for that.)

Victor

On Mon, May 24, 2021 at 5:54 PM Guido van Rossum <guido@python.org> wrote:

To the contrary, I think if you want the CI jobs to be faster you should add the CFLAGS to the configure call used to run the CI jobs. On Mon, May 24, 2021 at 1:30 PM Victor Stinner <vstinner@python.org> wrote:

On 5/24/2021 9:38 PM, Guido van Rossum wrote:
To the contrary, I think if you want the CI jobs to be faster you should add the CFLAGS to the configure call used to run the CI jobs.
Big +1

We should have the most useful interactive development/debugging options set by default (or with an obvious option), and use the complex overrides in our own automated systems.

Cheers,
Steve

On Tue, May 25, 2021 at 5:38 AM Guido van Rossum <guido@python.org> wrote:
To the contrary, I think if you want the CI jobs to be faster you should add the CFLAGS to the configure call used to run the CI jobs.
-Og speeds up not only CI jobs, but also the everyday "edit code and run `make test` with all assertions" cycle.

I don't have a strong opinion on which should be the default (+0 for -O0). I use -Og by default and use -O0 only when I need it.

FWIW, we can disable optimization on a per-file basis during debugging.

    // Put this line in files you want to debug.
    #pragma GCC optimize ("O0")

Regards,
-- Inada Naoki <songofacandy@gmail.com>

On Tue, May 25, 2021 at 7:49 PM Inada Naoki <songofacandy@gmail.com> wrote:
Agreed, what we do today is already fine. -Og or -O1 are decent options for fast, lightly optimized builds that lead to increased productivity in the common case.

Actually firing up a debugger on CPython's C code is not the common thing for a developer to do. When someone wants to do that, they should build with the relevant compiler flags for that purpose. ie: Skip should do this.

If there is confusion about the meaning of --with-pydebug, that's just a documentation/help-text update to be made.

-gps

FWIW, we can disable optimization per-file basis during debugging.

On Tue, May 25, 2021 at 7:42 PM Inada Naoki <songofacandy@gmail.com> wrote:
I usually edit and build many times before I'm happy enough with my work to even try "make test" (or the Windows equivalent :-). Which of these options *compiles* the fastest? Maybe that one should be the default. But perhaps other people's workflows are different (in fact it would be surprising if we all had the same workflow for writing C code :-).

In fact I'm guilty of not even building using --with-pydebug until I run into a problem I can't debug with fprintf(). :-) But once I do, it's likely that I want to aim a debugger at the code, *and* I'd be recompiling repeatedly.

The time to complete a full "make test" run on my local machine doesn't bother me; it's too slow to have in my development loop either way.

me> I'm having a hard time debugging some virtual machine code because GDB won't break where it's supposed to.

Here's a quick follow-up. I tried a number of different values of OPT during configuration and compilation, but nothing changed the result. I could never (and still can't) get GDB to break at the line it announces when a breakpoint is set using the TARGET_* labels generated for computed gotos. I also backed away from my dev branch and switched to up-to-date versions of main and 3.10. No difference...

So, I opened a bug against GDB:

https://sourceware.org/bugzilla/show_bug.cgi?id=27907

and ... wait for it ...

The person who responded (Keith Seitz @ RedHat) was unable to reproduce the problem. He encouraged me to build GDB and try again, and with some effort I was able to build an executable (wow, the GDB build process makes building Python look like a piece of cake). Still, the difference between the announced and actual line numbers of the breakpoint remains. I disabled Python support in GDB by renaming my ~/.gdbinit file which declares

    add-auto-load-safe-path /home/skip/src/python/rvm

That had no effect either. I don't have any LD_*_PATH environment variables set.

I think I've run out of things to try. I don't recall anyone here indicating they'd tried to replicate the problem. Could I bother someone to give it a whirl? It's easy. Just run GDB referring to a Python executable with computed gotos enabled and debug symbols included. At the (gdb) prompt, execute:

    b ceval.c:_PyEval_EvalFrameDefault:TARGET_LOAD_CONST
    run

and compare the line number announced when the breakpoint is set with the line number announced when execution stops. On my main branch (updated yesterday), using OPT='-O3 -g -Wall' I get an absolutely bonkers break in the action:

    % ~/src/binutils-gdb/gdb/gdb ./python
    GNU gdb (GDB) 11.0.50.20210524-git
    Copyright (C) 2021 Free Software Foundation, Inc.
    License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
    This is free software: you are free to change and redistribute it.
    There is NO WARRANTY, to the extent permitted by law.
    Type "show copying" and "show warranty" for details.
    This GDB was configured as "x86_64-pc-linux-gnu".
    Type "show configuration" for configuration details.
    For bug reporting instructions, please see:
    <https://www.gnu.org/software/gdb/bugs/>.
    Find the GDB manual and other documentation resources online at:
        <http://www.gnu.org/software/gdb/documentation/>.

    For help, type "help".
    Type "apropos word" to search for commands related to "word"...
    Reading symbols from ./python...
    (gdb) b ceval.c:_PyEval_EvalFrameDefault:TARGET_LOAD_CONST
    *Breakpoint 1 at 0x5e934: file Python/ceval.c, line 1836.*
    (gdb) r
    Starting program: /home/skip/src/python/rvm/python
    warning: the debug information found in "/lib64/ld-2.31.so" does not match "/lib64/ld-linux-x86-64.so.2" (CRC mismatch).

    [Thread debugging using libthread_db enabled]
    Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".

    *Breakpoint 1, 0x00005555555b2934 in _PyEval_EvalFrameDefault (tstate=<optimized out>, f=<optimized out>, throwflag=<optimized out>) at Python/ceval.c:2958*
    2958                DISPATCH();

LOAD_CONST is, in fact, defined at line ceval.c:1836. Line 2958 is the last line of the implementation of LOAD_NAME, just a few lines away :-/. If I get more detailed with the configure/compile options I can get the difference down to a few lines, but I've yet to see it work correctly. I'm currently offering

    OPT='-g -O0 -Wall' --with-pydebug --with-trace-refs

to the configure script. In most any other program, breaking a few lines ahead of where you wanted would just be an annoyance, but in the Python virtual machine, it makes the breakpoint useless.

Skip
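
For anyone trying to replicate this, the comparison between the announced line and the line where execution actually stops can be automated over a captured GDB session. The sketch below is illustrative only: it assumes the session text uses the two message shapes shown in the transcript above ("Breakpoint N at ADDR: file F, line L." and "Breakpoint N, ... at F:L"), and the helper name `breakpoint_mismatches` is hypothetical.

```python
import re

# "Breakpoint 1 at 0x5e934: file Python/ceval.c, line 1836."
SET_RE = re.compile(
    r"Breakpoint (\d+) at 0x[0-9a-fA-F]+: file (\S+), line (\d+)\."
)
# "Breakpoint 1, 0x... in func (args) at Python/ceval.c:2958"
# DOTALL lets the lazy match cross a line break, because GDB may wrap
# the "at FILE:LINE" part onto the following line.
HIT_RE = re.compile(r"Breakpoint (\d+), .*?\sat\s+(\S+):(\d+)", re.DOTALL)


def breakpoint_mismatches(transcript):
    """Return (number, announced, actual) for breakpoints that stopped
    somewhere other than the location announced when they were set.

    `announced` and `actual` are (file, line) tuples. This is a rough
    heuristic over text output, not a use of GDB's own API.
    """
    announced = {}
    for m in SET_RE.finditer(transcript):
        announced[m.group(1)] = (m.group(2), int(m.group(3)))

    mismatches = []
    for m in HIT_RE.finditer(transcript):
        number = m.group(1)
        actual = (m.group(2), int(m.group(3)))
        expected = announced.get(number)
        if expected is not None and expected != actual:
            mismatches.append((number, expected, actual))
    return mismatches
```

Feeding it the session above should report breakpoint 1 as announced at Python/ceval.c line 1836 but hit at line 2958. An empty list means every breakpoint stopped where GDB said it would.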

I'm having a hard time debugging some virtual machine code because GDB won't break where it's supposed to.
A quick follow-up. The GDB folks were able to reproduce this in an Xubuntu 20.04 VM. I don't know if they tried straight Ubuntu, but as the main difference between the two is the user interface, it seems likely the bug would surface there as well. The use of a VM thus provides another option as a workaround for me, though my simple-minded label-to-line-number script works as well.

Skip

participants (7)
- Gregory P. Smith
- Guido van Rossum
- Inada Naoki
- MRAB
- Skip Montanaro
- Steve Dower
- Victor Stinner