[Python-Dev] Simplify lnotab? (AST branch update)
Phillip J. Eby
pje at telecommunity.com
Fri Oct 14 03:55:20 CEST 2005
At 02:25 PM 10/14/2005 +1300, Greg Ewing wrote:
>Phillip J. Eby wrote:
>
> > +1. I'd be especially interested in lifting the current requirement
> > that line ranges and byte ranges both increase monotonically. Even
> > better if the lines for a particular piece of code don't have to all
> > come from the same file.
>
>How about an array of:
>
> +----------------+----------------+----------------+
> | bytecode index | file no. | line no. |
> +----------------+----------------+----------------+
>
>Entries are sorted by bytecode index, with each entry
>applying from that bytecode position up to the position
>of the next entry. The file no. indexes a tuple of file
>names attached to the code object. All entries are 32-bit
>integers.

The file number could be 16-bit - I don't see a use case for referring to
65,000 different filenames. ;) But that doesn't save much space.

Anyway, in the common case, this scheme will use 10 more bytes per line of
Python code than the current lnotab (12 bytes per entry instead of 2), which
translates to a megabyte or so for the standard library. I definitely like
the simplicity, but a meg's a meg. A more compact scheme is possible, by
using two tables - a bytecode->line number table, and a line number->file
table. In the single-file case, you can omit the second table, and the
first table then only uses 6 more bytes per line than we're currently using
(8 bytes per entry). Not fantastic, but probably more acceptable.
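
To make the arithmetic above concrete, here is a rough sketch. The entry sizes follow the figures in this thread (32-bit fields, and 2 bytes per entry for the current lnotab in the common case); the 100,000-line count is purely an assumed round number chosen to match "a megabyte or so", not a measurement of the standard library.

```python
import struct

# Assumed per-entry sizes, one entry per source line (the common case).
LNOTAB_ENTRY = 2                           # current lnotab: offset delta + line delta
FLAT_ENTRY = struct.calcsize("<III")       # bytecode index, file no., line no.
TWO_TABLE_ENTRY = struct.calcsize("<II")   # bytecode index, line no.


def extra_bytes(entry_size, lines):
    """Extra bytes compared with the current lnotab for `lines` entries."""
    return (entry_size - LNOTAB_ENTRY) * lines


ASSUMED_LINES = 100000  # hypothetical line count, for illustration only

print(extra_bytes(FLAT_ENTRY, ASSUMED_LINES))       # three-field table
print(extra_bytes(TWO_TABLE_ENTRY, ASSUMED_LINES))  # two-table scheme, single file
```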

If you have to encode multiple files, you just offset each file's line
numbers by the combined size of the files before it, and put entries in the
line->file table to match. When computing the line number, you subtract the
offset in the matching line->file entry to get the actual line number within
that file.
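
A minimal sketch of the two-table lookup, assuming plain Python lists of tuples rather than any packed format. The names `resolve`, `code_lines` and `line_files` are hypothetical, and the layout of the line->file entries as (offset, file no.) pairs is one possible reading of the description, not a settled design.

```python
from bisect import bisect_left, bisect_right


def resolve(offset, code_lines, line_files=(), filenames=("<only file>",)):
    """Map a bytecode offset to (filename, line number within that file).

    code_lines: (bytecode index, global line) pairs sorted by bytecode index;
                line numbers need not increase.
    line_files: (line offset, file no.) pairs sorted by line offset; a file's
                lines occupy global lines offset+1 onward. Empty means
                single-file code, where global lines are the real lines.
    """
    starts = [start for start, _ in code_lines]
    i = bisect_right(starts, offset) - 1      # last entry starting at or before offset
    if i < 0:
        raise ValueError("offset precedes the first table entry")
    line = code_lines[i][1]

    if not line_files:                        # second table omitted
        return filenames[0], line

    bases = [base for base, _ in line_files]
    j = bisect_left(bases, line) - 1          # last entry with offset < line
    if j < 0:
        raise ValueError("line precedes the first line->file entry")
    base, file_no = line_files[j]
    return filenames[file_no], line - base


# Hypothetical example: a.py is 100 lines long, so b.py's lines start at 101.
names = ("a.py", "b.py")
files = [(0, 0), (100, 1)]
table = [(0, 5), (6, 103), (12, 6)]           # lines jump to b.py and back

print(resolve(7, table, files, names))        # ('b.py', 3)
print(resolve(12, table, files, names))       # ('a.py', 6)
```

Both lookups are binary searches, so the second table adds one extra search only for code that actually spans files.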