On Thu, Sep 16, 2021 at 11:49, Fabio Zadrozny <fabiofz@gmail.com> wrote:
Hi all,

I have a weird case where I expected a line event to be issued, yet none is, even though the line change is present in the bytecode instructions (both in 3.9 and 3.10).

i.e.:

Given the code:

def check_backtrack(x):  # line 1
    if not (x == 'a'  # line 2
        or x == 'c'):  # line 3
        pass  # line 4

dis.dis shows the following for it:

  2           0 LOAD_FAST                0 (x)
              2 LOAD_CONST               1 ('a')
              4 COMPARE_OP               2 (==)
              6 POP_JUMP_IF_TRUE        12 (to 24)

  3           8 LOAD_FAST                0 (x)
             10 LOAD_CONST               2 ('c')
             12 COMPARE_OP               2 (==)

  2          14 POP_JUMP_IF_TRUE        10 (to 20)

  4          16 LOAD_CONST               0 (None)
             18 RETURN_VALUE

  2     >>   20 LOAD_CONST               0 (None)
             22 RETURN_VALUE
        >>   24 LOAD_CONST               0 (None)
             26 RETURN_VALUE
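
The line column of the listing above can also be read programmatically. The following is a minimal sketch, assuming the function is compiled from a string so that its line numbers match the comments in the example; `line_runs` is a hypothetical helper name, and the exact offsets it reports depend on the Python version.

```python
import dis

# Same function as above, compiled from a string so "def" is on line 1.
SOURCE = (
    "def check_backtrack(x):\n"
    "    if not (x == 'a'\n"
    "        or x == 'c'):\n"
    "        pass\n"
)

namespace = {}
exec(compile(SOURCE, "<check_backtrack>", "exec"), namespace)
check_backtrack = namespace["check_backtrack"]


def line_runs(func):
    """Return (start_offset, line) pairs, one per entry in the line table."""
    return list(dis.findlinestarts(func.__code__))


for offset, line in line_runs(check_backtrack):
    print(offset, line)
```

On 3.9 and 3.10 the pairs should mirror the left-hand column of the dis output, including the switch back to line 2 at the second jump; newer versions generate different bytecode, so the offsets will not match.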

So, by just following the instructions/line numbers, I'd say that when the instruction:

2          14 POP_JUMP_IF_TRUE        10 (to 20)

is executed, a line event would be issued. It isn't. However, if the bytecode at that offset is changed to include more instructions, the line event is issued.

For example, something like:

    def tracer(frame, event, arg):
        print(frame.f_lineno, event)  # line number and event name
        return tracer

    import sys
    sys.settrace(tracer)
    check_backtrack('f')

prints:

1 call
2 line
3 line
4 line
4 return

when I expected it to print:

1 call
2 line
3 line
2 line |<-- this is not being issued
4 line
4 return
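
To compare the observed and expected sequences mechanically, the events can be collected into a list instead of printed. This is only a sketch: `collect_events` is a hypothetical helper, it ignores frames other than the traced function, and the function is compiled from a string so that the line numbers match the listing. The exact sequence depends on the Python version and build.

```python
import sys

SOURCE = (
    "def check_backtrack(x):\n"
    "    if not (x == 'a'\n"
    "        or x == 'c'):\n"
    "        pass\n"
)

namespace = {}
exec(compile(SOURCE, "<check_backtrack>", "exec"), namespace)
check_backtrack = namespace["check_backtrack"]


def collect_events(func, *args):
    """Call func(*args) under sys.settrace and return [(lineno, event), ...]."""
    events = []
    code = func.__code__

    def tracer(frame, event, arg):
        if frame.f_code is not code:
            return None  # do not trace unrelated frames
        events.append((frame.f_lineno, event))
        return tracer

    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        func(*args)
    finally:
        sys.settrace(previous)  # restore whatever tracer was active before
    return events


for lineno, event in collect_events(check_backtrack, 'f'):
    print(lineno, event)
```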

So, I have some questions related to this:

Does anyone know why this happens?
What's the rule to identify this?
Why is that line number assigned to that instruction (i.e.: it seems a bit odd that this is set up like that in the first place)?

Thanks,

Fabio

p.s.: I'm asking because in a debugger which changes bytecode I want to keep the same semantics, and it appears that if I add more bytecode at that instruction offset, those semantics aren't kept (but I don't really know which semantics to keep here, since it seems that this instruction should issue a line event even though it doesn't).

Answering my own question after investigating:

It boils down to the way that ceval.c does prediction of bytecodes which makes it miss the line.

The COMPARE_OP handler looks something like:

TARGET(COMPARE_OP): {
    assert(oparg <= Py_GE);
    PyObject *right = POP();
    PyObject *left = TOP();
    PyObject *res = PyObject_RichCompare(left, right, oparg);
    SET_TOP(res);
    Py_DECREF(left);
    Py_DECREF(right);
    if (res == NULL)
        goto error;
    PREDICT(POP_JUMP_IF_FALSE);
    PREDICT(POP_JUMP_IF_TRUE);
    DISPATCH();
}

Given that, PREDICT jumps straight into the "POP_JUMP_IF_FALSE" / "POP_JUMP_IF_TRUE" handler, so the line check for that instruction is skipped and, for tracing purposes, its line is merged with that of the "COMPARE_OP". When the bytecode is manipulated so that the prediction no longer applies, this is no longer true, and the previously suppressed line events start to be generated in the tracing.

So, in the end it boils down to the eval loop not respecting what's written in the bytecode under some conditions. In this particular case that is probably good: the line of the jump seems to be off, which I'd call a bug in the bytecode generation, and it is masked by a bug in PREDICT ignoring that line, so the two bugs cancel each other out. My rewriting of the bytecode, however, makes the PREDICT fail, and then the wrong line in the bytecode becomes apparent while tracing.
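
The interaction can be illustrated with a toy model. This is not CPython's eval loop, only a sketch under simplifying assumptions: instructions run straight through (as for x == 'f'), a line event fires whenever the dispatch-time check sees a line different from the last one it saw, and a predicted instruction bypasses that check. All names here (`PROGRAM`, `PATCHED`, `line_events`, the inserted `NOP`) are hypothetical.

```python
# (opname, line) pairs following the dis listing above on the path
# taken for x == 'f' (no jump is taken).
PROGRAM = [
    ("LOAD_FAST", 2),
    ("LOAD_CONST", 2),
    ("COMPARE_OP", 2),
    ("POP_JUMP_IF_TRUE", 2),
    ("LOAD_FAST", 3),
    ("LOAD_CONST", 3),
    ("COMPARE_OP", 3),
    ("POP_JUMP_IF_TRUE", 2),  # the instruction whose line "goes back" to 2
    ("LOAD_CONST", 4),
    ("RETURN_VALUE", 4),
]

# Hypothetical debugger rewrite: one extra instruction before the second jump.
PATCHED = PROGRAM[:7] + [("NOP", 2)] + PROGRAM[7:]

PREDICTED = frozenset({
    ("COMPARE_OP", "POP_JUMP_IF_TRUE"),
    ("COMPARE_OP", "POP_JUMP_IF_FALSE"),
})


def line_events(program, predicted_pairs=PREDICTED):
    """Return the lines for which the toy loop would issue a line event."""
    events = []
    last_line = None
    skip_check = False  # True when the current instruction was predicted
    for index, (op, line) in enumerate(program):
        if not skip_check and line != last_line:
            events.append(line)
            last_line = line
        next_op = program[index + 1][0] if index + 1 < len(program) else None
        skip_check = (op, next_op) in predicted_pairs
    return events


print(line_events(PROGRAM))               # [2, 3, 4]
print(line_events(PROGRAM, frozenset()))  # [2, 3, 2, 4]
print(line_events(PATCHED))               # [2, 3, 2, 4]
```

In this model the extra "2 line" event appears either when prediction is disabled or when the inserted instruction breaks the COMPARE_OP / POP_JUMP_IF_TRUE pairing, which matches what I observe after rewriting the bytecode.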


Does anyone know whether, with the introduction of the new optimizations/quickening, PREDICT will still be used in 3.11? If there are now 2 interpretation modes, PREDICT probably makes it hard to have both behave the same, since the eval loop doesn't really match what the bytecode says because of PREDICT.

Thanks,

Fabio