On Wed, 4 Dec 2019 at 05:41, Chris Angelico <rosuav@gmail.com> wrote:
On Wed, Dec 4, 2019 at 3:16 PM Steven D'Aprano <steve@pearwood.info> wrote:
On Wed, Dec 04, 2019 at 01:47:53PM +1100, Chris Angelico wrote:
Integer sizes are a classic example of this. Is it acceptable to limit your integers to 2^16? 2^32? 2^64? Python made the choice to NOT limit its integers, and I haven't heard of any non-toy examples where an attacker causes you to evaluate 2**2**100, eating up all your RAM.
Do self-inflicted attacks count? I've managed to bring down a production machine, causing data loss, *twice* by thoughtlessly running something like 10**100**100 at the interactive interpreter. (Neither case was a server, just a desktop machine, but the data loss was still very real.)
Hmm, and you couldn't Ctrl-C it? I tried and was able to.
I don't know if this is OS-dependent, but I think recent CPython (3.8?) may have improved the handling of Ctrl-C in these cases. Certainly in the past I've seen situations where creating an absurdly large integer could not be interrupted before it was too late and the system needed a hard reboot. This is a common source of bugs in SymPy, e.g.: https://github.com/sympy/sympy/issues/17609#issuecomment-531327039

Those bugs can be fixed in SymPy itself, which is uniquely positioned to represent large exponent operations symbolically without evaluating them in dense integer format. On the spectrum of Python usage, though, I would have thought SymPy sat very much at the end that really wants enormous integers, so the fact that even SymPy needs to limit them makes me wonder who does really want to evaluate them.

Note that CPython's implementation of large integers is not as optimised as gmp, so anyone using Python for incredibly large integer calculations would be well advised not to use plain int anyway (SymPy will try to use gmpy/gmpy2 if available).
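Just to illustrate, application code can already guard against this kind of accident by estimating the size of the result before evaluating it. This is only a sketch; safe_pow and max_bits are invented names, not anything CPython provides:

    import math

    def safe_pow(base, exp, max_bits=10**9):
        # For base > 1 the result of base**exp has roughly
        # exp * log2(base) bits, which we can compute cheaply
        # without ever materialising the huge integer itself.
        if base > 1 and exp * math.log2(base) > max_bits:
            raise OverflowError(
                f"{base}**{exp} would need more than {max_bits} bits")
        return base ** exp

    safe_pow(10, 100)        # fine: the result is only ~333 bits
    safe_pow(10, 100**100)   # raises OverflowError instead of hanging

The point is that the cost of a power is predictable up front, so a limit does not have to be discovered by running out of memory.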
There ARE a few situations where I'd rather get a simple and clean MemoryError than have it drive my system into the swapper, but there are at least as many situations where you'd rather be able to use virtual memory instead of being forced to manually break a job up. But even there, you can't enshrine a limit in the language definition, since the actual threshold depends on the running system. (And can be far better enforced externally, at least on a Unix-like OS.)
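Agreed that this is better enforced externally. For what it's worth, on a Unix-like OS a process can also cap itself via the resource module, so the runaway case at least fails cleanly (the 2 GiB figure below is arbitrary):

    import resource

    # Cap this process's address space at 2 GiB (Unix only).  A runaway
    # 10**100**100 then fails with a MemoryError rather than driving the
    # whole machine into swap.
    gib = 1024**3
    resource.setrlimit(resource.RLIMIT_AS, (2 * gib, 2 * gib))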
Another possibility is a configurable limit, like the recursion limit, so that users can increase it when they want to. The default could be larger than most people would ever want but small enough that a single arithmetic operation can't bork the system on typical hardware. The default level and the configurability of the limit could then be implementation-defined.
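Concretely, I'm imagining something that mirrors the existing recursion limit API. The integer variants below are purely hypothetical names, nothing that exists today:

    import sys

    # The existing precedent:
    sys.setrecursionlimit(10000)

    # A hypothetical analogue for integer sizes:
    # sys.set_int_bit_limit(10**7)   # OverflowError above ~10 million bits
    # sys.get_int_bit_limit()

-- Oscar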