On Mon, 9 Dec 2019 at 14:10, Mark Shannon <mark@hotpy.org> wrote:
On 07/12/2019 7:37 pm, Oscar Benjamin wrote:
On Sat, 7 Dec 2019 at 06:29, Steven D'Aprano <steve@pearwood.info> wrote:
A million seems reasonable for lines of source code, if we're prepared to tell people using machine generated code to split their humongous .py files into multiple scripts. A small imposition on a small subset of Python users, for the benefit of all. I'm okay with that.
I recently hit on a situation that created a one million line code file: https://github.com/pytest-dev/pytest/issues/4406#issuecomment-439629715
The original file (which is included in SymPy) has 3000 lines averaging 500 characters per line so that the total file is 1.5MB. Since it is a test file pytest rewrites the corresponding pyc file and adds extra lines to annotate the intermediate results in the large expressions. The pytest-rewritten code has just over a million lines.
There are two possible solutions here (in the context of PEP 611)
1. Split the original SymPy test file into two or more files and the test function into many smaller functions.
In this particular situation I think that it isn't necessary for the file to be an imported .py file. It could be a newline delimited text file that is read by the test suite rather than imported. However if we are using this to consider the PEP then note that the file has orders of magnitude fewer lines than the one million limit proposed.
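As a rough illustration of the text-file alternative, here is a minimal sketch. It assumes one expression per line and a caller-supplied parser; the helper name `iter_expressions` and the file layout are hypothetical, not what the SymPy test suite actually does.

```python
from pathlib import Path
from typing import Callable, Iterator, TypeVar

T = TypeVar("T")


def iter_expressions(path: Path, parse: Callable[[str], T]) -> Iterator[T]:
    """Yield one parsed expression per non-empty line of a data file.

    Hypothetical helper: the expressions live in a plain text file that
    is read at test time, so they are never compiled as part of a
    Python module and no per-module line limit applies to them.
    """
    with path.open(encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if line:  # skip blank lines
                yield parse(line)
```

In a SymPy test, `parse` could presumably be something like `sympy.sympify`, with the test looping over the yielded expressions instead of containing them as source code.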
2. Up the line limit to two million and the bytecode limit to many million.
It sounds like a bytecode limit of one million is a lot more restrictive than a one million limit on lines.
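To get a feel for the difference, one can compare line counts with instruction counts using the standard `dis` module. This is only a sketch: the helper name and the sample source are made up for illustration, exact counts vary between CPython versions, and only the top-level code object is counted (nested functions are not).

```python
import dis


def count_lines_and_instructions(source: str) -> tuple[int, int]:
    """Return (source line count, bytecode instruction count) for a module.

    Only the module's top-level code object is inspected.
    """
    code = compile(source, "<example>", "exec")
    n_lines = len(source.splitlines())
    n_instructions = sum(1 for _ in dis.get_instructions(code))
    return n_lines, n_instructions


# Two lines of source, the second being one long expression, loosely
# modelled on the long lines in the test file discussed above.
source = "x = 1\ny = " + " + ".join(["x * x"] * 50) + "\n"
lines, instructions = count_lines_and_instructions(source)
print(lines, instructions)
```

A single long line compiles to many instructions, so a file with long lines would run into an instruction limit well before it runs into a line limit of the same size.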
Note that changing pytest to output fewer lines won't work as we will just hit the bytecode limit instead.
I'm not sure. I think that pytest should have some kind of limit on what it produces in this situation. The rewriting is just an optimistic attempt to produce more detailed information in the test failure traceback. There's no reason it can't just be disabled if it happens to produce overly long output. I think that point was briefly discussed in the pytest issue, but there isn't a clear answer for how to define the limits. With the PEP it could have been a little clearer, e.g. something like "definitely don't produce more than a million lines". In that sense these limits can be useful for people doing code generation.

--
Oscar