[New-bugs-announce] [issue39537] Change line number table format

Mark Shannon report at bugs.python.org
Mon Feb 3 05:06:34 EST 2020

New submission from Mark Shannon <mark at hotpy.org>:

The current line number table format has two issues that need to be addressed.

1. There is no way to express that a bytecode does not have a line number. The `END_ASYNC_FOR` bytecode, bytecodes for cleaning up the variable used to store exceptions in exception handlers, and a few other cases, are all artificial and should have no line number.

2. It is inefficient to find a line number when tracing.
Currently, whenever the line number changes, the line number table must be re-scanned from the start.
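
For reference, the existing `co_lnotab` encoding stores (bytecode increment, line increment) byte pairs, so a lookup has to accumulate both values from the first entry. A minimal sketch of that scan follows; the function and parameter names are illustrative rather than CPython's, and it assumes the signed line increments used since Python 3.6.

```python
def line_for_offset(lnotab: bytes, first_line: int, target: int) -> int:
    """Return the line for bytecode offset `target` by scanning from the start.

    `lnotab` is a sequence of (addr_incr, line_incr) byte pairs, both relative
    to the previous entry, which is why the scan cannot start in the middle.
    """
    addr = 0
    line = first_line
    for addr_incr, line_incr in zip(lnotab[0::2], lnotab[1::2]):
        addr += addr_incr
        if addr > target:
            break
        if line_incr >= 0x80:  # line increments are signed bytes
            line_incr -= 0x100
        line += line_incr
    return line
```

Because every entry is a delta, there is no way to resume from a later entry without already knowing the running totals.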

I propose to fix this by implementing a new line number table.
Each instruction (currently a pair of bytes) would have a one-byte line-offset value. An offset of 0 indicates that the instruction has no line number.

In addition to the offset table there would be a table of bytecode-offset, base-line pairs. Following the pairs is the instruction count.
Adding the instruction count at the end means that the table is not just a table of start, line pairs, but also a table of (inclusive) start, line, (exclusive) end triples. This format makes it very easy to scan forwards and backwards.
Because each entry covers up to 255 lines, the table is very small.
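
As an illustration of how the trailing instruction count turns pairs into triples, here is a small sketch. The representation as a Python list of tuples and the name `ranges_from_pairs` are assumptions made for the example; the proposal does not specify an in-memory layout.

```python
def ranges_from_pairs(pairs, instruction_count):
    """Convert (start, base_line) pairs plus a trailing instruction count
    into (inclusive start, base_line, exclusive end) triples.

    The end of each entry is the start of the next one; the end of the last
    entry is the instruction count stored after the pairs.
    """
    starts = [start for start, _ in pairs] + [instruction_count]
    return [
        (start, base_line, starts[i + 1])
        for i, (start, base_line) in enumerate(pairs)
    ]
```

With pairs `[(0, 10), (4, 300)]` and an instruction count of 9, the first entry covers instructions 0 to 3 and the second covers instructions 4 to 8.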

The line of the bytecode at `n*2` (instruction `n`) is calculated as:

offset = lnotab[n]
if offset == 0:
    line = -1 # artificial
else:
    line_base = scan_table_to_find(n)
    line = offset + line_base
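
A runnable sketch of this lookup, under stated assumptions: the per-instruction offset table is passed as `offsets` (a `bytes` object), the base table is a list of (start instruction, base line) pairs sorted by start, and `scan_table_to_find` takes the table as an explicit argument. None of these signatures come from an actual implementation.

```python
def scan_table_to_find(base_table, n):
    """Linear scan for the base line of the entry that covers instruction n."""
    line_base = None
    for start, base_line in base_table:
        if start > n:
            break
        line_base = base_line
    if line_base is None:
        raise ValueError(f"no table entry covers instruction {n}")
    return line_base


def line_of_instruction(offsets, base_table, n):
    """Return the line of instruction n, or -1 if it is artificial."""
    offset = offsets[n]
    if offset == 0:
        return -1  # artificial: no line number
    return offset + scan_table_to_find(base_table, n)
```

For example, with `offsets = bytes([1, 2, 0, 1])` and `base_table = [(0, 9), (3, 399)]`, instruction 2 is artificial and instruction 3 is on line 400.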

The new format fixes the two issues listed above.
1. Having no line number is expressed by a 0 in the offset table.
2. Since the offset-base table is made up of absolute values, not relative ones, it can be reliably scanned backwards. It is even possible to use a binary search, although a linear scan will be faster in almost all cases. 
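
Both search strategies can be sketched as follows, assuming the start offsets are held in a sorted list `starts` whose first element is 0. The names `locate` and `locate_bisect`, and the `hint` parameter standing in for a tracer's cached position, are hypothetical.

```python
from bisect import bisect_right


def locate(starts, instruction_count, n, hint=0):
    """Find the index of the entry covering instruction n, starting at hint.

    Absolute start values allow the scan to move in either direction from
    the entry found on the previous lookup.
    """
    if not 0 <= n < instruction_count:
        raise IndexError(n)
    i = hint
    while starts[i] > n:  # scan backwards
        i -= 1
    while i + 1 < len(starts) and starts[i + 1] <= n:  # scan forwards
        i += 1
    return i


def locate_bisect(starts, n):
    """Binary-search alternative: index of the last entry with start <= n."""
    return bisect_right(starts, n) - 1
```

A tracer would typically pass the index returned by the previous call as `hint`, so most lookups touch only one or two entries.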

The new format would be larger than the old one.
However, the code object is composed not only of code, but several tuples of names and constants as well, so increasing the size of the line number table has a small effect overall.

components: Interpreter Core
messages: 361277
nosy: Mark.Shannon
priority: normal
severity: normal
status: open
title: Change line number table format
type: performance
versions: Python 3.9

Python tracker <report at bugs.python.org>
