Hi Matthieu, 

The dis output for this function in 3.12 is the same as it is in 3.11.

The pseudo-instructions are emitted by the compiler's codegen stage, but never make it to compiled bytecode. They are removed or replaced by real opcodes before the code object is created.

The recent change to the dis module that you mentioned did not change how the disassembly of bytecode is displayed. Rather, it added the pseudo-instructions to the opcodes list so that we have access to their mnemonics from Python. This is a step towards exposing intermediate compilation steps to Python (for unit tests, etc.). BTW, part of this will require writing some test utilities for CPython that let us specify and compare opcode sequences, similar to what you have in bytecode.
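
To make the point about mnemonics concrete, here is a minimal sketch of how the pseudo-instructions could be listed from Python. It assumes that pseudo-instructions are exactly the entries of dis.opmap numbered above 255 (they cannot appear in real bytecode, where an opcode must fit in one byte); the helper name pseudo_instruction_names is made up for this illustration. On versions that predate the change it simply returns an empty dict.

```python
import dis


def pseudo_instruction_names():
    """Return {mnemonic: opcode} for opcodes that do not fit in one byte.

    Assumption: pseudo-instructions are the dis.opmap entries numbered
    above 255. Interpreters without pseudo-instructions yield {}.
    """
    return {name: op for name, op in dis.opmap.items() if op > 255}


pseudo_ops = pseudo_instruction_names()
# Print the mnemonics in opcode order (prints nothing on older versions).
for name, op in sorted(pseudo_ops.items(), key=lambda item: item[1]):
    print(f"{op:4d} {name}")
```

None of these names will show up in the dis output of a code object, since they are replaced before the code object is created.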

As for deconstructing the exception table and planting the pseudo instructions back into the code - it would be nice if dis could do that, but we may need to settle for an approximation because I'm not sure the exact block structure can be reliably reconstructed from the exception table at the moment. I may be wrong.

Having a SETUP_*/POP_BLOCK pair for each line in the exception table is not going to be correct. There can be nested try-except blocks, for instance, and even without them the compiler can emit the code of an except block in non-contiguous order. (In https://github.com/python/cpython/pull/93622 I fixed one of those cases to reduce the size of the exception table, but it wasn't a correctness bug.)
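
A small sketch of why one pair per row does not match the source structure. The rows below are invented for illustration (they are not real compiler output) and stand for an outer try whose body contains an inner try: the outer body is split into two rows that share a handler. Grouping rows by target, as the hypothetical helper group_rows_by_handler does, is only one possible approximation and is not what dis does.

```python
from collections import defaultdict


def group_rows_by_handler(rows):
    """Map each handler offset to the list of (start, end) ranges it covers."""
    ranges = defaultdict(list)
    for start, end, target in rows:
        ranges[target].append((start, end))
    return dict(ranges)


# Hypothetical (start, end, target) rows; the offsets are made up.
rows = [
    (4, 10, 60),   # outer try body, before the inner try -> outer handler
    (12, 20, 40),  # inner try body -> inner handler
    (22, 30, 60),  # outer try body, after the inner try -> outer handler
]

grouped = group_rows_by_handler(rows)
for target, covered in grouped.items():
    print(f"handler at {target} covers {covered}")
```

Three rows correspond to only two try blocks here, so emitting one SETUP_*/POP_BLOCK pair per row would describe three sequential blocks instead of two nested ones.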

Irit

On Tue, Jul 5, 2022 at 9:27 AM Matthieu Dartiailh <m.dartiailh@gmail.com> wrote:
Hi all,

I am the current maintainer of bytecode (https://github.com/MatthieuDartiailh/bytecode) which is a library to perform assembly and disassembly of Python bytecode. The library was created by V. Stinner.

I started looking into Python 3.11 support in bytecode. I read Objects/exception_handling_notes.txt and I have a couple of questions regarding the exception table:

Currently bytecode exposes three levels of abstraction:
  - the concrete level, in which one deals with instruction offsets for jumps and explicit indexing into the known constants and names
  - the bytecode level, which uses labels for jumps and allows non-integer arguments to instructions
  - the cfg level, which provides basic block delineation over the bytecode level

So my first idea was to directly expose the unpacked exception table (start, stop, target, stack_depth, last_i) at the concrete level and use pseudo-instructions and labels at the bytecode level. At this point in my reflections, I saw https://github.com/python/cpython/commit/c57aad777afc6c0b382981ee9e4bc94c03bf5f68 about adding pseudo-instructions to dis output in 3.12 and thought it would line up quite nicely. Reading through, I got curious about how SETUP_WITH handled popping one extra item from the stack, so I went to look at dis results on a couple of small examples. I tried on 3.10 and 3.11b3 (for some reason I cannot compile main at a391b74d on Windows).

I looked at simple things and got a bit surprised:

Disassembling:

def f():
    try:
        a = 1
    except:
        raise

I get on 3.11:
  1           0 RESUME                   0

  2           2 NOP

  3           4 LOAD_CONST               1 (1)
              6 STORE_FAST               0 (a)
              8 LOAD_CONST               0 (None)
             10 RETURN_VALUE
        >>   12 PUSH_EXC_INFO

  4          14 POP_TOP

  5          16 RAISE_VARARGS            0
        >>   18 COPY                     3
             20 POP_EXCEPT
             22 RERAISE                  1
ExceptionTable:
  4 to 6 -> 12 [0]
  12 to 16 -> 18 [1] lasti
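
For reference, the two rows above can be recovered from the packed table with a small parser. This is a sketch based on my reading of the format described in Objects/exception_handling_notes.txt (four varints per entry, 6 payload bits per byte, bit 6 as the continuation flag, offsets counted in 2-byte code units); the byte string below is hand-encoded from the rows above rather than read from a real code object.

```python
def parse_varint(byte_iter):
    """Read one big-endian varint: 6 payload bits per byte, bit 6 = more."""
    b = next(byte_iter)
    value = b & 63
    while b & 64:
        value <<= 6
        b = next(byte_iter)
        value |= b & 63
    return value


def parse_exception_table(table):
    """Unpack table bytes into (start, end, target, depth, lasti) tuples.

    Offsets are converted from code units to bytes; `end` is inclusive,
    matching the way dis prints the table.
    """
    entries = []
    byte_iter = iter(table)
    try:
        while True:
            start = parse_varint(byte_iter) * 2
            length = parse_varint(byte_iter) * 2
            target = parse_varint(byte_iter) * 2
            depth_lasti = parse_varint(byte_iter)
            entries.append(
                (start, start + length - 2, target,
                 depth_lasti >> 1, bool(depth_lasti & 1))
            )
    except StopIteration:
        return entries


# Hand-encoded (assumed) bytes for the two rows shown above; the top bit
# of the first byte of each entry marks the start of that entry.
table = bytes([0x82, 0x02, 0x06, 0x00, 0x86, 0x03, 0x09, 0x03])
entries = parse_exception_table(table)
for start, end, target, depth, lasti in entries:
    print(f"{start} to {end} -> {target} [{depth}]{' lasti' if lasti else ''}")
```

On 3.11 the same function could be fed code.co_exceptiontable directly.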

On 3.10:
  2           0 SETUP_FINALLY            5 (to 12)

  3           2 LOAD_CONST               1 (1)
              4 STORE_FAST               0 (a)
              6 POP_BLOCK
              8 LOAD_CONST               0 (None)
             10 RETURN_VALUE

  4     >>   12 POP_TOP
             14 POP_TOP
             16 POP_TOP

  5          18 RAISE_VARARGS            0

This surprised me on two levels:
- first, I had never seen the RESUME opcode and it is currently not documented
- my second surprise comes from the second entry in the exception table. At first I failed to see why it was needed, but writing this I realize it corresponds to the explicit handling of exception propagation to the caller. Since I cannot compile 3.12 ATM, I am wondering how this plays with pseudo-instructions: in particular, are pseudo-instructions generated for all entries in the exception table?

My initial idea was to have a SETUP_FINALLY/SETUP_CLEANUP - POP_BLOCK pair for each line in the exception table and a label for the jump target. But I realize it means we will have many more such pairs than in 3.10. That is fine by me, but I wondered what choice was made in 3.12 dis and whether this approach made sense.

Best regards

Matthieu
_______________________________________________
Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/XZ7KDCI3TXEUERU3YIFKC543GAGIYG6Q/