Thanks a lot Gregory for the comments!

An additional cost to this is things that parse text tracebacks not knowing how to handle it and things that log tracebacks generating additional output. We should provide a way for people to disable the feature on a process as part of this while they address tooling and logging issues (via the usual set of command line flag + Python env var + runtime API).
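To make the opt-out concrete, here is a minimal sketch of how the three switches could feed one check in the display path. Every name in it (`PYTHON_NO_TRACEBACK_COLUMNS`, `no_traceback_columns`, `set_column_display`, `format_location`) is hypothetical and chosen only for illustration; nothing here is an agreed interface, and the precedence order (runtime API, then flag, then environment) is also an assumption.

```python
import os

# Hypothetical names, used only for illustration.
ENV_VAR = "PYTHON_NO_TRACEBACK_COLUMNS"
X_OPTION = "no_traceback_columns"

_runtime_override = None  # set by the hypothetical runtime API below


def set_column_display(enabled):
    """Hypothetical runtime API: force the feature on or off for this process."""
    global _runtime_override
    _runtime_override = bool(enabled)


def column_display_enabled(xoptions, environ):
    """Resolve the setting: runtime API first, then the flag, then the env var."""
    if _runtime_override is not None:
        return _runtime_override
    if X_OPTION in xoptions:
        return False
    if environ.get(ENV_VAR):
        return False
    return True


def format_location(line, start, end, xoptions=(), environ=os.environ):
    """Render a source line, adding a caret line only when the feature is on."""
    out = "    " + line
    if column_display_enabled(xoptions, environ):  # the single conditional
        out += "\n    " + " " * start + "^" * (end - start)
    return out
```

With the feature disabled, `format_location` returns just the indented source line, so tools that parse text tracebacks would see the output they expect today.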
Absolutely! We were thinking about that, and it is easy enough: it is a single conditional in the display function plus the extra init configuration. Neither of those is large. While I'd lean towards uint8_t instead of
uint16_t because not even humans can understand a 255-character line, so why bother being pretty about such a thing... Just document the caveat and move on with the lower value. A future pyc format could change it if a compelling argument were ever found.
I very much agree with you here, but it is worth noting that I have heard the counter-argument that the longer the line is, the more important it may be to distinguish what part of the line is wrong.

A compromise if you want to handle longer lines: a single uint16_t. Represent the start column in 9 bits and the width in the other 7 bits (or any variation thereof); it's all a matter of what tradeoff you want to make for space reasons. Encoding as start + width instead of start + end is likely better anyway if you care about compression, as the width byte will usually be small and thus friendlier to compression. I'd personally ignore compression entirely.
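For reference, the 9 + 7 bit split could look roughly like the sketch below. The function names are made up for illustration, and saturating values that do not fit (instead of raising or using a sentinel) is an assumption of this sketch, not something proposed in the thread.

```python
START_BITS = 9
WIDTH_BITS = 7
MAX_START = (1 << START_BITS) - 1  # 511: largest start column that fits
MAX_WIDTH = (1 << WIDTH_BITS) - 1  # 127: largest width that fits


def pack_span(start, width):
    """Pack (start column, width) into one 16-bit value.

    Out-of-range values are clamped to the maximum (an assumption of this sketch).
    """
    start = min(start, MAX_START)
    width = min(width, MAX_WIDTH)
    return (start << WIDTH_BITS) | width


def unpack_span(packed):
    """Recover (start column, width) from a value produced by pack_span."""
    return packed >> WIDTH_BITS, packed & MAX_WIDTH


packed = pack_span(4, 5)
start, width = unpack_span(packed)
end = start + width  # the end column is derived, not stored
```

This layout can address start columns up to 511 with spans up to 127 characters wide, compared with 255 for both values when using two uint8_t fields.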
I would personally prefer not to implement very tricky compression algorithms because tools may need to parse this and I don't want to complicate the logic a lot. Handling lnotab is already a bit painful, and when bugs occur it makes debugging very tricky. Being able to index something by the index of the instruction is quite a good API in my opinion.

Overall doing this is going to be a big win for developer productivity!

Thanks! We think that this has a lot of potential indeed! :)

Pablo