Idea: Compressing the stack on the fly

Hi everybody,

Here's an idea I had a while ago. Now, I'm an ignoramus when it comes to how programming languages are implemented, so this idea will most likely be either (a) completely impossible or (b) trivial knowledge.

I was thinking about the implementation of the factorial in Python. I was comparing in my mind two different solutions: the recursive one, and the one that uses a loop. Here are example implementations for them:

    def factorial_recursive(n):
        if n == 1:
            return 1
        return n * factorial_recursive(n - 1)

    def factorial_loop(n):
        result = 1
        for i in range(1, n + 1):
            result *= i
        return result

I know that the recursive one is problematic, because it's putting a lot of items on the stack. In fact it's using the stack as if it were a loop variable. The stack wasn't meant to be used like that.

Then the question came to me: why? Maybe the stack could be built to handle this kind of (ab)use?

I read about tail-call optimization on Wikipedia. If I understand correctly, the gist of it is that the interpreter tries to recognize, on a frame-by-frame basis, which frames could be completely eliminated, and then it eliminates those. Then I read Guido's blog post explaining why he doesn't want it in Python. In that post he outlined four different reasons why TCO shouldn't be implemented in Python.

But then I thought, maybe you could do something smarter than eliminating individual stack frames. Maybe we could create something that is to the current implementation of the stack what `xrange` is to the old-style `range`: a smart object that allows access to any of a long list of items in it, without actually having to store those items. This would solve the first argument that Guido raises in his post, which I found to be the most substantial one.

What I'm saying is: imagine the stack of the interpreter when it runs the factorial example above for n=1000. It has around 1000 items in it and it's just about to explode. But if you looked at the contents of that stack, you'd see it's embarrassingly regular, a compression algorithm's wet dream. It's just the same code location over and over again, with a different value for `n`.

So what I'm suggesting is an algorithm to compress that stack on the fly. An algorithm that would detect regularities in the stack and, instead of saving each individual frame, save just the pattern. Then there wouldn't be any problem with showing an informative stack trace: despite not storing every individual frame, each individual frame could still be *accessed*, similarly to how `xrange` allows access to each individual member without having to store all of them.

Then the stack could hold a lot more items, and tasks that currently rely on recursion (like pickling using the standard library) would be able to handle much deeper recursions.

What do you think?

Ram.
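To make the idea above a little more concrete, here is a minimal pure-Python sketch of the kind of structure being described. It is only an illustration: CompressedStack, FrameRun and Frame are made-up names, and CPython's real frame stack is of course nothing like a list of Python objects.

    # A toy sketch of the "compressed stack" idea. Nothing here is CPython's
    # real frame machinery; the class names are hypothetical.

    class Frame:
        """A fake stack frame: a code location plus the locals that differ."""
        def __init__(self, location, locals_):
            self.location = location
            self.locals = locals_

    class FrameRun:
        """A run of frames sharing one code location, stored as a pattern.

        Instead of keeping one frame object per call, it keeps the location
        once plus the values that vary between frames (here, just `n`).
        """
        def __init__(self, location):
            self.location = location
            self.varying = []              # one entry per "virtual" frame

        def push(self, value):
            self.varying.append(value)

        def __len__(self):
            return len(self.varying)

        def materialize(self, index):
            # Lazily reconstruct frame `index`, xrange-style: computed on
            # demand rather than stored.
            return Frame(self.location, {'n': self.varying[index]})

    class CompressedStack:
        """Stores runs of identical code locations instead of individual frames."""
        def __init__(self):
            self.runs = []

        def push(self, location, value):
            if not self.runs or self.runs[-1].location != location:
                self.runs.append(FrameRun(location))
            self.runs[-1].push(value)

        def __len__(self):
            return sum(len(run) for run in self.runs)

        def frame(self, index):
            # Random access to any individual frame without having stored it.
            for run in self.runs:
                if index < len(run):
                    return run.materialize(index)
                index -= len(run)
            raise IndexError(index)

    # Simulate factorial_recursive(1000): the interpreter would keep ~1000
    # frames; this structure keeps one run holding 1000 small values.
    stack = CompressedStack()
    for n in range(1000, 0, -1):
        stack.push('factorial_recursive:line 3', n)

    print(len(stack.runs), len(stack), stack.frame(500).locals)   # 1 1000 {'n': 500}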

On 5/26/2013 8:00 AM, Ram Rachum wrote:
This fails for 0 (0! == 1 also), but that is easily fixed. It also fails for negative and non-integral values, but ignore that for the moment.

    def factorial_tail(n, result=1):
        if n > 1:
            return factorial_tail(n - 1, result * n)
        else:
            return result

    def factorial_while(n, result=1):
        # written to maximize sameness with tail version
        while n > 1:
            result = result * n
            n = n - 1
        else:
            return result

If one can rewrite the 'body' recursion (my term) as tail recursion, the translation to a while loop is trivial, and for linear recursion on counts, the for-loop version is simple and often even clearer, as it explicitly identifies all the values used in the computation.
Now consider a production-ready function that will always terminate:

    def factorial(n):
        if n < 0 or n != int(n):
            raise ValueError('factorial input must be a count')
        <body as above>

To do the same with recursion, you have to write the main function as a nested function to avoid uselessly doing the check more than once. Conclusion: for linear processing of collections, use a while or for loop.
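For illustration, a sketch of the nested-function pattern mentioned above (hypothetical code, not from the message): the argument is validated once in the outer function, and only the inner helper recurses, so the check is not repeated on every call.

    # Validate once, then recurse only on the inner helper.
    def factorial(n):
        if n < 0 or n != int(n):
            raise ValueError('factorial input must be a count')

        def helper(n, result=1):
            if n > 1:
                return helper(n - 1, result * n)
            return result

        return helper(n)

    print(factorial(5))   # 120
    print(factorial(0))   # 1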
<compress stack>
I do not see much in the idea:

* Converting recursion to iteration compresses the stack to 1 frame.
* It slows down all computation for a rare benefit.
* It does not really scale very well. In most languages, factorial overflows long before there is a stack problem. To be realistic, consider linearly processing a billion-character string with recursion versus iteration.
* Recursion does not work with iterators as well as iteration does. Iterators are one of Python's crown jewels.

      with open('terabyte.txt') as f:
          for c in chars(f):
              process(c)

  is the right idiom for independently processing the characters of a terabyte file.

-- 
Terry Jan Reedy
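The `chars()` and `process()` names in the message above are left undefined there; the following is just one assumed reading, in which `chars()` is a generator that yields one character at a time so the file is never read into memory all at once.

    # Assumed definitions for the chars()/process() idiom quoted above.
    def chars(f, blocksize=1 << 16):
        """Yield the characters of an open text file one at a time."""
        while True:
            block = f.read(blocksize)
            if not block:
                return
            for c in block:
                yield c

    def process(c):
        pass  # stand-in for whatever per-character work the caller wants

    with open('terabyte.txt') as f:   # assumes such a file exists
        for c in chars(f):
            process(c)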

And, besides everybody else's comments, I'd like to point out that tail-recursion elimination _is_ feasible in Python as it is today. One can create a decorator, meant only for suitable tail-recursive functions, that does some bookkeeping about when the function is called - and, as part of that bookkeeping, makes the call to the decorated function inside a try-except block. If the function is called recursively, it then raises an appropriate exception, which makes execution continue at the frame-stack level of the first call. I have a blog post with such a toy - nevertheless, it is just a toy. (If there was some small problem that could be elegantly approached in this fashion but not iteratively, it could be used in production, though.)

http://metapython.blogspot.com.br/2010/11/tail-recursion-elimination-in-pyth...

js
-><-

On 26 May 2013 09:00, Ram Rachum <ram.rachum@gmail.com> wrote:
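The linked post is truncated above, so here is only a sketch of the general trick being described (not the author's code): a decorator that notices a recursive call by inspecting the frame stack, raises an exception to unwind back to the outermost invocation, and then re-calls the function with the new arguments.

    # A toy exception-based tail-call trick: recursive calls are intercepted
    # and unwound back to the outermost wrapper, so the stack stays flat.
    import sys

    class _TailCall(Exception):
        def __init__(self, call_args, call_kwargs):
            self.call_args, self.call_kwargs = call_args, call_kwargs

    def tail_recursive(func):
        def wrapper(*args, **kwargs):
            frame = sys._getframe()
            # Two frames up is the previous wrapper call if we are recursing;
            # in that case, signal the outer wrapper instead of going deeper.
            if (frame.f_back is not None and frame.f_back.f_back is not None
                    and frame.f_back.f_back.f_code is frame.f_code):
                raise _TailCall(args, kwargs)
            while True:
                try:
                    return func(*args, **kwargs)
                except _TailCall as call:
                    args, kwargs = call.call_args, call.call_kwargs
        return wrapper

    @tail_recursive
    def factorial(n, result=1):
        if n > 1:
            return factorial(n - 1, result * n)
        return result

    print(factorial(10000) > 0)   # deeper than the default recursion limit: True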

On Mon, May 27, 2013 at 9:01 PM, Joao S. O. Bueno <jsbueno@python.org.br> wrote:
What can tail recursion do that can't be done by reassigning to the function parameters and 'goto'-ing back to the top? Or, in the absence of an actual goto, a construct like this:

    def tail_recursive_function(some_arg):
        while True:
            # ... code
            if call_self:
                # return tail_recursive_function(some_other_arg)
                some_arg = some_other_arg
                continue
            # ... more code
            # falling off the end:
            break

which basically amounts to a goto, but using extra keywords to avoid the one that people hate. Is there any fundamental difference? I've never understood there to be any, but I'm only one, and possibly I'm wrong.

ChrisA
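As a concrete (hypothetical) instance of the rewrite Chris describes, here is the same function written once with a tail call and once with the reassign-and-continue loop standing in for the missing goto.

    def gcd_tail(a, b):
        if b == 0:
            return a
        return gcd_tail(b, a % b)        # tail call

    def gcd_loop(a, b):
        while True:
            if b == 0:
                return a
            a, b = b, a % b              # reassign the parameters...
            continue                     # ...and "goto" the top of the loop

    assert gcd_tail(1071, 462) == gcd_loop(1071, 462) == 21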

On 5/27/2013 9:30 AM, Chris Angelico wrote:
That style can't handle mutually recursive procedures, or the extreme case: a state machine implemented with N functions, each of which calls the next state function at the end. Tail-call elimination isn't simply about noticing recursive calls. It's about noticing that a function ends with a function call, and not burning another stack frame in the process. --Ned.
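A small (hypothetical) illustration of the point: two functions whose tail calls go to each other, so the single-function loop rewrite shown earlier cannot be applied directly. In CPython today this simply hits the recursion limit, whereas with general tail-call elimination it would run in constant stack space.

    def is_even(n):
        if n == 0:
            return True
        return is_odd(n - 1)     # tail call into the *other* function

    def is_odd(n):
        if n == 0:
            return False
        return is_even(n - 1)    # tail call back again

    print(is_even(500))              # fine, well within the default limit
    try:
        print(is_even(10 ** 6))      # depth ~10**6: blows the stack today
    except RuntimeError as exc:      # RecursionError subclasses RuntimeError
        print('hit the recursion limit:', exc)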

On May 27, 2013, at 7:05, Chris Angelico <rosuav@gmail.com> wrote:
That's exactly why I suggested that it would be more interesting in a PyPy-style tracing JIT than in a static compiler. At runtime, you can detect that your traced fast path has a FUNC opcode that calls a function whose body is already part of the trace, and turn that into code that stashes just enough information to generate stack frames if needed, then do a goto. In simple cases, that stashing could amount to no more than incrementing a counter; in complex cases, I think it might be closely related to stashing enough info to create generator states if you fall off the fast path.

That should be much simpler than trying to detect patterns that will lead to tail calls. And it means you only make the effort when it makes a difference. But, most importantly, it means you can benefit even in non-tail-call cases, like the naive recursion in the original post. Even when you can't eliminate recursion you will still "compress" the state. And it might also let you optimize chains of generator yields or coroutine sends with not much extra effort (which might even be a basis for optimizing out explicit coroutine trampolines, which would be very cool).

This is pretty different from traditional TCO, but given that TCO isn't relevant to the OP's proposal or to his example, I don't think that's a problem.

You can just use a memoize decorator to automatically convert your recursive solution to a fast linear-time one. Search for "python memoization decorator". This would make a much broader range of recursive solutions run in linear time.

Best,
Neil

On Sunday, May 26, 2013 8:00:13 AM UTC-4, Ram Rachum wrote:
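A sketch of the memoization suggestion above, using the standard library's functools.lru_cache as the "memoize decorator"; the classic payoff is naive Fibonacci, which goes from exponential to linear time. (Caching saves repeated work, not stack depth, so a very large argument would still hit the recursion limit.)

    from functools import lru_cache

    @lru_cache(maxsize=None)
    def fib(n):
        if n < 2:
            return n
        return fib(n - 1) + fib(n - 2)

    print(fib(300))   # answers instantly; uncached, this would never finish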

-----Original Message----- From: Westley Martínez [mailto:anikom15@gmail.com] Sent: Wednesday, September 11, 2013 9:03 PM To: 'Ram Rachum'; 'python-ideas@googlegroups.com' Cc: 'Ram Rachum' Subject: RE: [Python-ideas] Idea: Compressing the stack on the fly
I think this is an interesting idea. It sounds possible, but the question is whether or not it can be done efficiently with Python. I'd heed Guido's advice in first implementing this. It could probably be done effectively in a compiled language like C, but I'd imagine it'd be too difficult for Python. The other question is usability: what would this actually be used for? I'm not a fan of recursion. I think anything that uses recursion could be restructured into something simpler. A lot of people find recursion to be elegant; for me it just hurts my brain.

Does anyone actually write recursive Python code where the recursion is a significant bottleneck? The only such code I can think of is either for a tree, in which case stack depth is irrelevant, or bad code. Why would anyone care, basically?

This sounds like something that the PyPy team might be interested in.

On 12.09.2013 06:29, Joshua Landau wrote:
Any kind of backtracking algorithm will need recursion or a separate stack data structure to keep track of the various decisions made up to a certain point on the path. The C stack is rather limited in size, so a recursive parser can easily blow up if it uses the C stack alone for managing backtracking.

-- 
Marc-Andre Lemburg
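As a hypothetical illustration of the two approaches contrasted above, here is the same backtracking search - subsets of a list summing to a target - written once recursively and once with an explicit stack of pending decisions; only the latter is bounded by memory rather than by the recursion limit.

    # Backtracking over take/skip decisions, recursively and with an explicit
    # stack. The explicit stack lives on the heap as an ordinary Python list.

    def subset_sums_recursive(nums, target, i=0, chosen=()):
        if target == 0:
            yield chosen
            return
        if i == len(nums) or target < 0:
            return
        # decision 1: take nums[i]; decision 2: skip it
        yield from subset_sums_recursive(nums, target - nums[i], i + 1, chosen + (nums[i],))
        yield from subset_sums_recursive(nums, target, i + 1, chosen)

    def subset_sums_iterative(nums, target):
        stack = [(target, 0, ())]          # explicit stack of pending states
        while stack:
            remaining, i, chosen = stack.pop()
            if remaining == 0:
                yield chosen
                continue
            if i == len(nums) or remaining < 0:
                continue
            stack.append((remaining, i + 1, chosen))                          # skip nums[i]
            stack.append((remaining - nums[i], i + 1, chosen + (nums[i],)))   # take nums[i]

    nums = [3, 34, 4, 12, 5, 2]
    print(sorted(subset_sums_recursive(nums, 9)) == sorted(subset_sums_iterative(nums, 9)))  # True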

On 12.09.2013 09:03, Joshua Landau wrote:
Oh, that's easy. It just depends on the given data set that you're working on and how often you have to branch when working on it. http://en.wikipedia.org/wiki/Backtracking lists a few problems.

Here's a regular expression example that would blow the stack, if the re module were still using it (it was fixed in 2003 to no longer do so):

    re.match('(.*a|.*b|x)+', 'x' * 100000)

The expression still uses exponential time, though. With Python 2.3, you see the stack limit error:

    Python 2.3.5 (#1, Aug 24 2011, 15:52:42)
    [GCC 4.5.0 20100604 [gcc-4_5-branch revision 160292]] on linux2
    Type "help", "copyright", "credits" or "license" for more information.
-- 
Marc-Andre Lemburg

On 12 September 2013 05:29, Joshua Landau <joshua@landau.ws> wrote:
I think you're asking this question the wrong way. Recursion isn't a bottleneck that slows down your program. When you hit the recursion limit your program just blows up. Since Python doesn't have the optimisations that make any particular kind of recursion scale well, people do not generally use it unless they know that the depth is small enough. Currently code that is susceptible to hitting the recursion limit is "bad code" because it depends on optimisations that don't exist. However, if the optimisations did exist then people could choose to take advantage of them.

As an example, I once implemented Tarjan's algorithm in Python using the recursive form shown here:
http://en.wikipedia.org/wiki/Tarjan's_strongly_connected_components_algorithm#The_algorithm_in_pseudocode

After implementing it and confirming that it worked I immediately found that it hit the recursion limit in my real problem. So I reimplemented it without the recursion. Had there been optimisations that would have made the reimplementation unnecessary, I would have happily stuck with the first form, since it was easier to understand than the explicit stack of iterators version that I ended up with.

For the same reasons you won't see much code out there where recursion is a bottleneck unless, as you say, it is "bad code".

Oscar
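Purely as an illustration of the "explicit stack of iterators" shape Oscar mentions (not his Tarjan code), here is a depth-first traversal written both ways; the second version replaces recursion with a list of iterators and so survives arbitrarily deep graphs.

    def dfs_recursive(graph, node, seen=None):
        seen = set() if seen is None else seen
        seen.add(node)
        yield node
        for child in graph[node]:
            if child not in seen:
                yield from dfs_recursive(graph, child, seen)

    def dfs_iterative(graph, start):
        seen = {start}
        yield start
        stack = [iter(graph[start])]          # one iterator per "virtual" frame
        while stack:
            for child in stack[-1]:
                if child not in seen:
                    seen.add(child)
                    yield child
                    stack.append(iter(graph[child]))   # "recurse"
                    break
            else:
                stack.pop()                            # "return"

    # A path graph deep enough to break the recursive version, not the iterative one.
    deep = {i: [i + 1] for i in range(10000)}
    deep[10000] = []
    print(len(list(dfs_iterative(deep, 0))))           # 10001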

participants (13)
- Andrew Barnert
- Chris Angelico
- Clay Sweetser
- Joao S. O. Bueno
- Joshua Landau
- M.-A. Lemburg
- Ned Batchelder
- Neil Girdhar
- Oscar Benjamin
- Ram Rachum
- Serhiy Storchaka
- Terry Jan Reedy
- Westley Martínez