Hi Guido,
On 04/12/2019 3:51 pm, Guido van Rossum wrote:
> I am overwhelmed by this thread (and a few other things in real life)
> but here are some thoughts.
>
> 1. It seems the PEP doesn't sufficiently show that there is a problem to
> be solved. There are claims of inefficiency but these aren't
> substantiated and I kind of doubt that e.g. representing line numbers in
> 32 bits rather than 20 bits is a problem.
Fundamentally, this is not about immediate performance gains, but
about the potential gains from not having to support huge, vaguely
defined limits that are never needed in practice.
Regarding line numbers: decoding the line number table for exception
tracebacks, profiling and debugging is expensive, and the cost is linear
in the size of the code object. So the performance benefit would be
largest for the code that is nearest to the limits.
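To illustrate the linear cost, here is a minimal sketch that maps a bytecode offset to a line by walking the table from the start. It uses the public dis.findlinestarts() helper rather than CPython's internal decoder, and make_code() and line_for_offset() are hypothetical names used only for this example.

```python
import dis


def make_code(n_lines):
    """Compile a module of n_lines trivial assignments (hypothetical helper)."""
    source = "\n".join(f"x{i} = {i}" for i in range(n_lines)) + "\n"
    return compile(source, "<generated>", "exec")


def line_for_offset(code, offset):
    """Return (line, entries_scanned) for a bytecode offset.

    The table is walked from its first entry, so the work grows with the
    number of entries that precede the offset.
    """
    line = None
    scanned = 0
    for start, lineno in dis.findlinestarts(code):
        if start > offset:
            break
        scanned += 1
        if lineno is not None:
            line = lineno
    return line, scanned


small = make_code(10)
large = make_code(1000)

# Look up the line of the last instruction in each code object.
small_line, small_scanned = line_for_offset(small, len(small.co_code) - 1)
large_line, large_scanned = line_for_offset(large, len(large.co_code) - 1)
```

The lookup near the end of the larger code object has to scan roughly a hundred times as many entries, which is the sense in which the cost is linear in the size of the code object.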
>
> 2. I have handled complaints in the past about existing (accidental)
> limits that caused problems for generated code. People occasionally
> generate *really* wacky code (IIRC the most recent case was a team that
> was generating Python code from machine learning models they had
> developed using other software) and as long as it works I don't want to
> limit such applications.
The key word here is "occasionally". How much do we want to increase the
costs for every Python user for the sake of the very rare code generator
that might bump into a limit?
>
> 3. Is it easy to work around a limit? Even if it is, it may be a huge
> pain. I've heard of a limit of 65,000 methods in Java on Android, and my
> understanding was that it was actually a huge pain for both the
> toolchain maintainers and app developers (IIRC the toolchain had special
> tricks to work around it, but those required app developers to change
> their workflow). Yes, 65,000 is a lot smaller than a million, but in a
> different context the same concern applies.
A limit of 64k *methods* is far smaller than one of 1M *classes*. At 6
methods per class, it is roughly 100 times smaller.
The largest Python code bases that I am aware of are at JP Morgan, with
something like 36M LOC, and Bank of America, with a similar number.
Assuming a code base of 50M LOC, *and* that all the code would be loaded
into a single application (I sincerely hope that isn't the case) *and*
that each class is only 100 lines, even then there would only be 500,000
classes.
If a single application has 500k classes, I don't think that a limit of
1M classes would be its biggest problem :)
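As a rough check of the arithmetic above, here is a short sketch. The constants are just the figures quoted in this message, with "64k" taken as 65,536; nothing here is measured.

```python
# Figures quoted in the message; all of them are estimates, not measurements.
METHOD_LIMIT = 64 * 1024      # "64k methods" (the Android/Java limit)
CLASS_LIMIT = 1_000_000       # proposed limit on the number of classes
METHODS_PER_CLASS = 6         # assumed average number of methods per class

# Number of methods that fit under the class limit, relative to 64k methods.
methods_under_class_limit = CLASS_LIMIT * METHODS_PER_CLASS
ratio = methods_under_class_limit / METHOD_LIMIT    # a little over 90

CODE_BASE_LOC = 50_000_000    # assumed size of a very large code base
LINES_PER_CLASS = 100         # deliberately small classes

estimated_classes = CODE_BASE_LOC // LINES_PER_CLASS
headroom = CLASS_LIMIT / estimated_classes          # remaining factor below 1M
```

Even under these pessimistic assumptions the estimate stays a factor of two below the proposed limit.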
>
> 4. What does Python currently do if you approach or exceed one of these
> limits? I tried a simple experiment, eval(str(list(range(2000000)))),
> and this completes in a few seconds, even though the source code is a
> single 16 Mbyte-long line.
You can have lines as long as you like :)
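A scaled-down version of that experiment, as a sketch: roundtrip_single_line() is a hypothetical helper, and n is kept smaller than in the quoted example so that it runs quickly.

```python
def roundtrip_single_line(n):
    """Build a one-line list literal of n integers and evaluate it."""
    source = str(list(range(n)))   # the whole literal is a single line
    value = eval(source)           # parsed and evaluated as one long line
    return source, value


source, value = roundtrip_single_line(200_000)
line_count = source.count("\n") + 1
source_length = len(source)        # well over a million characters
```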
>
> 5. On the other hand, the current parser cannot handle more than 100
> nested parentheses, and I've not heard complaints about this. I suspect
> the number of nested indent levels is similarly constrained by the
> parser. The default function call recursion limit is set to 1000 and
> bumping it significantly risks segfaults. So clearly some limits exist
> and are apparently acceptable.
>
> 6. In Linux and other UNIX-y systems, there are many per-process or
> per-user limits, and they can be tuned -- the user (using sudo) can
> change many of those limits, the sysadmin can change the defaults within
> some range, and sometimes the kernel can be recompiled with different
> absolute limits (not an option for most users or even sysadmins). These
> limits are also quite varied -- the maximum number of open file
> descriptors is different than the maximum pipe buffer size. This is of
> course as it should be -- the limits exist to protect the OS and other
> users/processes from runaway code and intentional attacks on resources.
> (And yet, fork bombs exist, and it's easy to fill up a filesystem...) I
> take from this that limits are useful, may have to be overridable, and
> should have values that make sense given the resource they guard.
Being able to dynamically *reduce* a limit from one million seems like a
good idea.
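One way to picture a limit that can be reduced but not raised above one million is sketched below. ReducibleLimit and its methods are purely hypothetical, not an API from the PEP or from CPython.

```python
class ReducibleLimit:
    """Hypothetical limit that can be lowered at runtime, never raised
    above its hard ceiling. Illustration only, not a CPython API."""

    def __init__(self, ceiling=1_000_000):
        self.ceiling = ceiling
        self.value = ceiling

    def reduce_to(self, new_value):
        # Any value from 1 up to the hard ceiling is accepted.
        if not 0 < new_value <= self.ceiling:
            raise ValueError(f"limit must be in 1..{self.ceiling}")
        self.value = new_value

    def check(self, count):
        # Raise if the count exceeds the currently configured limit.
        if count > self.value:
            raise OverflowError(f"{count} exceeds limit of {self.value}")


max_classes = ReducibleLimit()
max_classes.reduce_to(10_000)   # e.g. a sandbox that wants a tighter bound
max_classes.check(9_999)        # within the reduced limit, so no error
```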
>
> --
> --Guido van Rossum (python.org/~guido <http://python.org/~guido>)
> /Pronouns: he/him //(why is my pronoun here?)/
> <http://feministing.com/2015/02/03/how-using-they-as-a-singular-pronoun-can-change-the-world/>
_______________________________________________
Python-Dev mailing list -- python-dev@python.org
Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/Z4QO3SJDKXMWOP5H5XBFSPSIEFH6BJBS/