On 03Dec2019 0815, Mark Shannon wrote:
Hi Everyone,
I am proposing a new PEP, still in draft form, to impose a limit of one million on various aspects of Python programs, such as the lines of code per module.
I assume you're aiming for acceptance in just under four months? :)
Any thoughts or feedback?
It's actually not an unreasonable idea, to be fair. Picking an arbitrary limit less than 2**32 is certainly safer for many reasons, and very unlikely to impact real usage. We already have some real limits well below 10**6 (such as if/else depth and recursion limits). That said, I don't really want to impact edge-case usage, and I'm all too familiar with other examples of arbitrary limits (no file system would need a path longer than 260 characters, right? :o) ).

Some comments on the specific items, assuming we're not just going to reject this out of hand.

Specification
=============
This PEP proposes that the following language features and runtime values be limited to one million.
* The number of source code lines in a module
This one feels the most arbitrary. What if I have a million blank lines or comments? We still need the correct line number to be stored, which means our lineno fields still have to go beyond 10**6. A limit of 10**6 total lines per module is certainly too small.
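To make the blank-line case concrete, here is a minimal sketch (my own illustration, not anything from the draft) that compiles a module made of a million blank lines followed by one function, then reads back the line number stored for that function. The helper name `first_function_lineno` is made up for the example, and it assumes an interpreter that does not enforce the proposed limit.

```python
import types

def first_function_lineno(source):
    """Compile `source` and return co_firstlineno of the first nested code object."""
    module_code = compile(source, "<example>", "exec")
    for const in module_code.co_consts:
        if isinstance(const, types.CodeType):
            return const.co_firstlineno
    raise ValueError("no function or class body found in source")

# One million blank lines, then a function that starts on line 1_000_001.
source = "\n" * 1_000_000 + "def f():\n    pass\n"
lineno = first_function_lineno(source)
print(lineno)
```

The stored value is past 10**6 even though the module contains only two lines of actual code, which is why the lineno fields can't simply be narrowed along with the limit.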
* The number of bytecode instructions in a code object.
Seems reasonable.
* The sum of local variables and stack usage for a code object.
I suspect our effective limit is already lower than 10**6 here anyway - do we know what it actually is?
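This doesn't answer what the hard limit is, but as a rough sketch, the two quantities being summed are already visible on any code object, so it would be easy to survey what existing code actually uses. The helper name `locals_plus_stack` is hypothetical, and I'm assuming `co_nlocals + co_stacksize` is the sum the draft has in mind (it ignores cell and free variables).

```python
def locals_plus_stack(func):
    """Return co_nlocals + co_stacksize for a function's code object.

    Assumption: this is the "sum of local variables and stack usage"
    the draft refers to; cell and free variables are not counted here.
    """
    code = func.__code__
    return code.co_nlocals + code.co_stacksize

def example(a, b):
    c = a + b
    return [a, b, c]

print(example.__code__.co_nlocals, example.__code__.co_stacksize)
print(locals_plus_stack(example))
```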
* The number of distinct names in a code object
SGTM.
* The number of constants in a code object.
SGTM.
* The number of classes in a running interpreter.
I'm a little hesitant on this one, but perhaps there's a way to use a sentinel for class_id (in your later struct) for when someone exceeds this limit? The benefits seem worthwhile here even without the rest of the PEP.
* The number of live coroutines in a running interpreter.
SGTM. At this point we're probably putting serious pressure on kernel wait objects/FDs anyway, and if you're not waiting then you're probably not efficiently using coroutines anyway.
Having 20 bit operands (21 bits for relative branches) allows instructions to fit into 32 bits without needing additional ``EXTENDED_ARG`` instructions. This improves dispatch, as the operand is strictly local to the instruction. Using super-instructions would make the 32 bit format almost as compact as the 16 bit format, and significantly faster.
We can measure this - how common are EXTENDED_ARG instructions? ISTR we checked this when switching to 16-bit instructions and it was worth it, but I'm not sure whether we also considered 32-bit instructions at that time.
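A minimal sketch of how that measurement might look, using the `dis` module to walk a code object and everything nested in its constants. The function name `count_extended_args` is made up for this example, and a real survey would need to run it over a representative body of code (the stdlib, popular packages) rather than a synthetic module.

```python
import dis
import types

def count_extended_args(code):
    """Return (extended_arg_count, total_instruction_count) for a code
    object and all code objects nested in its constants."""
    extended = 0
    total = 0
    pending = [code]
    while pending:
        current = pending.pop()
        for instr in dis.get_instructions(current):
            total += 1
            if instr.opname == "EXTENDED_ARG":
                extended += 1
        # Function and class bodies are stored as constants of their parent.
        pending.extend(c for c in current.co_consts if isinstance(c, types.CodeType))
    return extended, total

# Synthetic module with 300 distinct names, enough to push some name
# indexes past what a single 8-bit operand can hold.
big_source = "\n".join(f"x{i} = None" for i in range(300))
extended, total = count_extended_args(compile(big_source, "<example>", "exec"))
print(f"{extended} EXTENDED_ARG out of {total} instructions")
```

The ratio from a synthetic module like this says nothing about real code; it only confirms the counter picks the instructions up.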
Total number of classes in a running interpreter
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
This limit has the potential to reduce the size of object headers considerably.
This would be awesome, and I *think* it's ABI compatible (as the affected fields are all put behind the PyObject* that gets returned, right?). If so, I think it's worth calling that out in the text, as it's not immediately obvious.

Cheers,
Steve