On Fri, Mar 5, 2021 at 11:11 AM Brett Cannon <brett@python.org> wrote:
Speaking for myself ...

Ditto ...

On Fri, Mar 5, 2021 at 7:04 AM Mark Shannon <mark@hotpy.org> wrote:
[...]
In some cases, the PEP would have improved the situation.

For example:
sys.setrecursionlimit(5000)
def f():
     f()

Currently, it raises a RecursionError on linux, but crashes the
interpreter on Windows.
With PEP 651 it would have raised a RecursionError on both platforms.

Am I missing something here?

So your example shows a user already comfortable in raising their recursion limit to work around needing more stack space to reach completion. What is stopping the user from continuing to raise the limit until they still reach their memory limit even with PEP 651? If you're worried about runaway recursion you will very likely hit that with the default stack depth already, so I personally don't see how a decoupled stack counter from the C stack specifically makes it any easier/better to detect runaway recursion. And if I need more recursion than the default, you're going to bump the recursion depth anyway, which weakens the protection in either the C or decoupled counter scenarios. Sure, it's currently platform-specific, but plenty of people want to push that limit based on their machine anyway and don't need consistency on platforms they will never run on, i.e. I don't see a huge benefit to being able to say that an algorithm consistently won't go past 5000 calls on all platforms compared to what the C stack protection already gives us (not to say there's zero benefit, but it isn't massive or widespread either IMO). I personally just don't see many people saying, "I really want to limit my program to an exact call stack depth of 5000 on all platforms which is beyond the default, but anything under is fine and anything over -- regardless of what the system can actually handle -- is unacceptable".

Tack on the amount of changes required to give a cross-platform stack count and limit check compared to the benefit being proposed, and to me that pushes what the PEP is proposing into net-negative payoff.

To me, the point of that example is as a reminder that currently fiddling with the recursion limit can cause segfaults.

Mark's PEP proposes two, somewhat independent, changes: (1) don't consume C stack on pure Python-to-Python (pp2p) function calls; (2) implement fool-proof C stack overflow checks.

Change (2) makes it safe for users to mess with the stack overflow limit however they see fit. Despite (1), the limit for pp2p calls remains at 1000 so that users who unintentionally write some naively recursive code don't have to wait until they fill up all of memory before they get a traceback. (Of course they could also write a while-True loop that keeps adding an item to a list and they'd end up in the same situation. But in my experience that situation is less painful to deal with than accidental stack overflow, and I'd shudder at the thought of a traceback of a million lines.)

Given that we have (1), why is (2) still needed? Because there are ways to recursively call Python code that aren't pp2p calls. By a pp2p (pure Python-to-Python) call, I mean any direct call, e.g. a method call or a function call. But what about other operations that can invoke Python code? E.g. if we have a dict d and a class C, we could create an instance of C and use it to index d, e.g. d[C()]. This operation is not a p2pp call -- the BINARY_SUBSCR opcode calls the dict's `__getitem__` method, and that calls the key's `__hash__` method. Here's a silly demonstration:
```
def f(c):
    d = {}
    return d[c]

class C:
    def __hash__(self):
        return f(self)

f(C())
```
Note that the "traceback compression" implemented for simple recursive calls fails here -- I just ran this code and got 2000 lines of output.

The way I imagine Mark wants to implement pp2p calls means that in this case each recursion step *does* add several other C stack frames, and this would be caught by the limit implemented in (2). I see no easy way around this -- after all the C code involved in the recursion could be a piece of 3rd party C code that itself is not at fault.

So we could choose to implement only (2), robust C stack overflow checks. This would require a bunch of platform-specific code, and there might be platforms where we don't know how to implement this (I vaguely recall a UNIX version where main() received a growable stack but each thread only had a fixed 64kb stack), but those would be no worse off than today.

Or we could choose to implement only (1), eliminating C stack usage for pp2p calls. But in that case we'd still need a recursion limit for non-pp2p calls. We could eliminate the recursion limit entirely for pp2p calls, but then we'd probably end up with user complaints -- I could imagine a framework that executes arbitrary user code and attempts recovery after receiving an exception (for example a REPL). Such a framework can never be 100% foolproof (e.g. `while True: l.append(1)`, or `os._exit(0)`), but accidental recursion would be a reasonable thing to attempt to curtail, so we'd still need to implement a limit on pp2p calls. Since the new pp2p implementation allows for a much larger recursion limit, Mark's design with two separate limits and exceptions makes sense -- "recursion" to limit pp2p call recursion, and "stack overflow" to limit the C stack overflow. To make things more reliable, it makes sense to throw in (2), better checks for C stack overflow, but that could be done independently -- we could just implement the two separate limits and exceptions and C-stack-frame-free pp2p calls without (2), and we'd be no worse off than today with regard to the safety of "wild" recursion like my example above.

--
--Guido van Rossum (python.org/~guido)