Interactive Debugging of Python
All this talk about stack frames and manipulating them at runtime has reminded me of one of my biggest gripes about Python. When I say "biggest gripe", I really mean "biggest surprise" or "biggest shame". That is, Python is very interactive and dynamic. However, when I am debugging Python, it seems to lose this. There is no way for me to effectively change a running program.

Now with VC6, I can do this with C. Although it is slow and a little dumb, I can change the C side of my Python world while my program is running, but not the Python side of the world.

I'm wondering how feasible it would be to change Python code _while_ running under the debugger. Presumably this would require a way of recompiling the current block of code, patching this code back into the object, and somehow tricking the stack frame into using this new block of code; even if a first cut had to restart the block or somesuch...

Any thoughts on this?

Mark.
Mark Hammond wrote:
I'm writing a prototype of a stackless Python, which means that you will be able to access the current state of the interpreter completely. The inner interpreter loop will be isolated from the frame dispatcher. It will break whenever the ticker goes to zero. If you set the ticker to one, you will be able to single-step on every opcode and have the value stack, the frame chain, everything.

I think you can do a great deal with this. But tell me if you want a callback hook somewhere.

ciao - chris

--
Christian Tismer :^) <mailto:tismer@appliedbiometrics.com>
Applied Biometrics GmbH : Have a break! Take a ride on Python's
Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net
10553 Berlin : PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF
we're tired of banana software - shipped green, ripens at home
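For readers on a current CPython (3.7 or later), something close to the per-opcode single-stepping described above can be approximated from pure Python with `sys.settrace` and the frame attribute `f_trace_opcodes`. This is a sketch of that later facility, not of the stackless prototype's ticker; `count_opcodes` and `sample` are hypothetical names chosen for illustration.

```python
import sys


def count_opcodes(func):
    """Run func() and count how many bytecode instructions were traced."""
    seen = 0

    def tracer(frame, event, arg):
        nonlocal seen
        if event == "call":
            # Ask the interpreter for per-opcode events in this frame.
            frame.f_trace_opcodes = True
        elif event == "opcode":
            # At this point frame.f_lasti, frame.f_locals and frame.f_back
            # are all available for inspection.
            seen += 1
        return tracer

    sys.settrace(tracer)
    try:
        func()
    finally:
        sys.settrace(None)
    return seen


def sample():
    total = 0
    for i in range(3):
        total += i
    return total


opcode_count = count_opcodes(sample)
```

The exact count depends on the interpreter version, so the sketch only relies on it being positive.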
I think the main point is how to change code when a Python frame already references it. I don't think the structure of the frames is as important as the general concept. But while we were talking frame-fiddling it seemed a good point to try and hijack it a little :-)

Would it be possible to recompile just a block of code (e.g., just the current function or method) and patch it back in such a way that the current frame continues execution of the new code?

I feel this is somewhat related to the inability to change class implementation for an existing instance. I know there have been hacks around this before, but they aren't completely reliable, and IMO it would be nice if the core Python made it easier to change already running code. Whether that code is in an existing stack frame, or just in an already created instance, it is very difficult to do.

This has come up to try and deflect some conversation away from changing Python as such towards an attempt at enhancing its _environment_. To paraphrase many people before me, even if we completely froze the language now there would still be plenty of work ahead of us :-)

Mark.
This topic sounds mostly unrelated to the stackless discussion -- in either case you need to be able to fiddle the contents of the frame and the bytecode pointer to reflect the changed function.

Some issues:

- The slots containing local variables may be renumbered after recompilation; fortunately we know the name--number mapping so we can move them to their new location. But it is still tricky.

- Should you be able to edit functions that are present on the call stack below the top? Suppose we have two functions:

    def f():
        return 1 + g()

    def g():
        return 0

  Suppose we set a break in g(), and then edit the source of f(). We can do all sorts of evil to f(): e.g. we could change it to

    return g() + 2

  which affects the contents of the value stack when g() returns (originally, the value stack contained the value 1, now it is empty). Or we could even change f() to

    return 3

  thereby eliminating the call to g() altogether!

What kind of limitations do other systems that support modifying a "live" program being debugged impose? Only allowing modification of the function at the top of the stack might eliminate some problems, although there are still ways to mess up.

The value stack is not always empty even when we only stop at statement boundaries -- e.g. it contains 'for' loop indices, and there's also the 'block' stack, which contains try-except information. E.g. what should happen if we change

    def f():
        for i in range(10):
            print 1

stopped at the 'print 1' into

    def f():
        print 1

??? (Ditto for removing or adding a try/except block.)
I've been thinking a bit about this. Function objects now have mutable func_code attributes (and also func_defaults); I think we can use this.

The hard part is to do the analysis needed to decide which functions to recompile! Ideally, we would simply edit a file and tell the programming environment "recompile this". The programming environment would compare the changed file with the old version that it had saved for this purpose, and notice (for example) that we changed two methods of class C. It would then recompile those methods only and stuff the new code objects in the corresponding function objects.

But what would it do when we changed a global variable? Say a module originally contains a statement "x = 0". Now we change the source code to say "x = 100". Should we change the variable x? Suppose that x is modified by some of the computations in the module, and that, after some computations, the actual value of x was 50. Should the "recompile" reset x to 100 or leave it alone?

One option would be to actually change the semantics of the class and def statements so that they modify an existing class or function rather than using assignment. Effectively, this proposal would change the semantics of

    class A:
        ...some code...

    class A:
        ...some more code...

to be the same as

    class A:
        ...some code...
        ...some more code...

This is somewhat similar to the way the module or package commands in some other dynamic languages work, I think; and I don't think this would break too much existing code.

The proposal would also change

    def f():
        ...some code...

    def f():
        ...other code...

but here the equivalence is not so easy to express, since I want different semantics (I don't want the second f's code to be tacked onto the end of the first f's code). If we understand that

    def f():
        ...

really does the following:

    f = NewFunctionObject()
    f.func_code = ...code object...

then the construct above (def f(): ... def f(): ...) would do this:

    f = NewFunctionObject()
    f.func_code = ...some code...
    f.func_code = ...other code...

i.e. there is no assignment of a new function object for the second def. Of course if there is a variable f but it is not a function, it would have to be assigned a new function object first.

But in the case of def, this *does* break existing code. E.g.

    # module A
    from B import f
    .
    .
    .
    if ...some test...:
        def f():
            ...some code...

This idiom conditionally redefines a function that was also imported from some other module. The proposed new semantics would change B.f in place!

So perhaps these new semantics should only be invoked when a special "reload-compile" is asked for... Or perhaps the programming environment could do this through source parsing as I proposed before...
Please, no more posts about Scheme. Each new post mentioning call/cc makes it *less* likely that something like that will ever be part of Python. "What if Guido's brain exploded?" :-) --Guido van Rossum (home page: http://www.python.org/~guido/)
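The in-place patching idea can be tried on a current Python 3, where `func_code` and `func_defaults` are spelled `__code__` and `__defaults__`. What follows is only a sketch under assumptions: `patch_function`, `area`, and the replacement source are hypothetical, and it assumes the replacement has no closure variables and is compiled in an empty scratch namespace.

```python
def patch_function(func, new_source, name=None):
    """Compile new_source and install the resulting code object into func."""
    namespace = {}
    exec(compile(new_source, "<patched>", "exec"), namespace)
    replacement = namespace[name or func.__name__]
    # Assigning __code__ mutates the existing function object, so every
    # reference to it (including "from B import f" style aliases) changes.
    func.__code__ = replacement.__code__
    func.__defaults__ = replacement.__defaults__
    return func


def area(width, height):
    return width + height  # deliberate bug, to be patched below


alias = area  # stands in for a reference held by another module

patch_function(area, "def area(width, height):\n    return width * height\n")

fixed_result = alias(3, 4)
```

Because the function object itself is mutated, `alias` sees the fix without being rebound. That is the convenience being proposed, and also the reason the `from B import f` idiom above would be affected.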
Guido> What kind of limitations do other systems that support modifying
Guido> a "live" program being debugged impose? Only allowing
Guido> modification of the function at the top of the stack might
Guido> eliminate some problems, although there are still ways to mess
Guido> up.

Frame objects maintain pointers to the active code objects, locals and globals, so modifying a function object's code or globals shouldn't have any effect on currently executing frames, right? I assume frame objects do the usual INCREF/DECREF dance, so the old code object won't get deleted before the frame object is tossed.

Guido> But what would it do when we changed a global variable? Say a
Guido> module originally contains a statement "x = 0". Now we change
Guido> the source code to say "x = 100". Should we change the variable
Guido> x? Suppose that x is modified by some of the computations in the
Guido> module, and that, after some computations, the actual value
Guido> of x was 50. Should the "recompile" reset x to 100 or leave it
Guido> alone?

I think you should note the change for users and give them some way to easily pick between old initial value, new initial value or current value.

Guido> Please, no more posts about Scheme. Each new post mentioning
Guido> call/cc makes it *less* likely that something like that will ever
Guido> be part of Python. "What if Guido's brain exploded?" :-)

I agree. I see call/cc or set! and my eyes just glaze over...

Skip Montanaro | Mojam: "Uniting the World of Music" http://www.mojam.com/
skip@mojam.com | Musi-Cal: http://www.musi-cal.com/
518-372-5583
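Skip's expectation about executing frames can be checked on a current Python 3, where the attribute is named `__code__`. This is a minimal sketch; `greet` and `replacement` are hypothetical functions invented for the demonstration.

```python
def replacement():
    return "new"


def greet():
    # Swap our own code object while this frame is still running.
    greet.__code__ = replacement.__code__
    # The active frame holds its own reference to the old code object,
    # so execution continues here rather than jumping into the new code.
    return "old"


first_call = greet()   # runs the original body to completion
second_call = greet()  # a fresh frame picks up the swapped-in code
```

The first call finishes under the old code even though the function object was changed mid-flight. Only frames created afterwards use the new code, which is why patching the function object alone does not update a frame that is already executing.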
[Guido writes...]
> This topic sounds mostly unrelated to the stackless discussion -- in

Sure is - I just saw that as an excuse to try and hijack it <wink>

> Some issues:
> - The slots containing local variables may be renumbered after
Generally, I think we could make something very useful even with a number of limitations. For example, I would find a first cut completely acceptable, and a great improvement on today, if:

* Only the function at the top of the stack can be recompiled and have the code reflected while executing. This function must also be restarted after such an edit. If the function uses global variables or makes calls that restarting will screw up, then either a) make the code changes _before_ doing this stuff, or b) live with it for now, and help us remove the limitation :-)

That may make the locals being renumbered easier to deal with, and also remove some of the problems you discussed about editing functions below the top.

> What kind of limitations do other systems that support modifying a "live" program being debugged impose? Only allowing modification of

I can only speak for VC, and from experience at that - I haven't attempted to find documentation on it.

It accepts most changes while running. The current line is fine. If you create or change the definition of globals (and possibly even the type of locals?), the "incremental compilation" fails, and you are given the option of continuing with the old code, or stopping the process and doing a full build. When the debug session terminates, some link process (and maybe even compilation?) is done to bring the .exe on disk up to date with the changes.

If you do weird stuff like delete the line being executed, it usually gives you some warning message before either restarting the function or trying to pick a line somewhere near the line you deleted. Either way, it can screw up, moving the "current" line somewhere else - it doesn't crash the debugger, but may not do exactly what you expected. It is still a _huge_ win, and a great feature!

Ironically, I turn this feature _off_ for Python extensions. Although changing the C code is great, in 99% of the cases I also need to change some .py code, and as existing instances are affected I need to restart the app anyway - so I may as well do a normal build at that time. I.e., C now lets me debug incrementally, but a far more dynamic language prevents this feature being useful ;-)
If we forced a restart would this be better? Can we reliably reset the stack to the start of the current function?
If this would work for the few changed functions/methods, what would the impact be of doing it for _every_ function (changed or not)? Then the analysis can drop to the module level, which is much easier. I don't think a slight performance hit is a problem at all when doing this stuff.

Or extending this (didn't this come up at the latest IPC?):

    # .\package\__init__.py
    class BigMutha:
        pass

    # .\package\something.py
    class package.BigMutha:
        def some_category_of_methods():
            ...

    # .\package\other.py
    class package.BigMutha:
        def other_category_of_methods():
            ...

[Of course, this won't fly as it stands; just a conceptual possibility]
OK, restarting the function seems a reasonable compromise and would seem relatively easy to implement. Not *real* easy though: it turns out that eval_code2() is called with a code object as argument, and it's not entirely trivial to figure out the corresponding function object from which to grab the new code object. But it could be done -- give it a try. (Don't wait for me, I'm ducking for cover until at least mid June.)
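One way to go from a code object back to the function objects that use it, on a current CPython, is to ask the garbage collector for referrers. This is a sketch of the lookup only, not of how `eval_code2()` would do it in C; `functions_using` and `sample` are hypothetical names, and `gc.get_referrers` is a debugging aid whose results depend on what the collector tracks.

```python
import gc
import types


def functions_using(code):
    """Return the function objects whose __code__ is this code object."""
    return [
        obj
        for obj in gc.get_referrers(code)
        if isinstance(obj, types.FunctionType) and obj.__code__ is code
    ]


def sample():
    return 42


owners = functions_using(sample.__code__)
```

More than one function can share a code object (closures created in a loop, for example), so the lookup returns a list rather than a single owner.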
I hear you.
> If we forced a restart would this be better? Can we reliably reset the stack to the start of the current function?
Yes, no problem.
Yes, this would be fine too.
Have no fear. I've learned to say no. :-) --Guido van Rossum (home page: http://www.python.org/~guido/)
[GvR]
As an ex-compiler guy, I should have something wise to say about that. Alas, I've never used a system that allowed more than poking new values into vrbls, and the thought of any more than that makes me vaguely ill! Oh, that's right -- I'm vaguely ill anyway today. Still-- oooooh -- the problems. This later got reduced to restarting the topmost function from scratch. That has some attraction, especially on the bang-for-buck-o-meter.
What a pussy <wink>. Really, overall continuations are much less trouble to understand than threads -- there's only one function in the entire interface!

OK. So how do you feel about coroutines? Would sure be nice to have *some* way to get pseudo-parallel semantics regardless of OS.

changing-code-on-the-fly-==-mutating-the-current-continuation-ly y'rs - tim
On Fri, 21 May 1999, Tim Peters wrote:
OK. So how do you feel about coroutines? Would sure be nice to have *some* way to get pseudo-parallel semantics regardless of OS.
I read about coroutines years ago on c.l.py, but I admit I forgot it all. Can you explain them briefly in pseudo-python? --david
[Tim]
OK. So how do you feel about coroutines? Would sure be nice to have *some* way to get pseudo-parallel semantics regardless of OS.
[David Ascher]
I read about coroutines years ago on c.l.py, but I admit I forgot it all. Can you explain them briefly in pseudo-python?
How about real Python? http://www.python.org/tim_one/000169.html contains a complete coroutine implementation using threads under the covers (& exactly 5 years old tomorrow <wink>). If I were to do it over again, I'd use a different object interface (making coroutines objects in their own right instead of funneling everything through a "coroutine controller" object), but the ideas are the same in every coroutine language. The post contains several executable examples, from simple to "literature standard".

I had forgotten all about this: it contains solutions to the same "compare tree fringes" problem Sam mentioned, *and* the generator-based building block I posted three other solutions for in this thread. That last looks like:

    # fringe visits a nested list in inorder, and detaches for each non-list
    # element; raises EarlyExit after the list is exhausted
    def fringe( co, list ):
        for x in list:
            if type(x) is type([]):
                fringe(co, x)
            else:
                co.detach(x)

    def printinorder( list ):
        co = Coroutine()
        f = co.create(fringe, co, list)
        try:
            while 1:
                print co.tran(f),
        except EarlyExit:
            pass
        print

    printinorder([1,2,3])  # 1 2 3
    printinorder([[[[1,[2]]],3]]) # ditto
    x = [0, 1, [2, [3]], [4,5], [[[6]]] ]
    printinorder(x) # 0 1 2 3 4 5 6

Generators are really "half a coroutine", so this doesn't show the full power (other examples in the post do). co.detach is a special way to deal with this asymmetry. In the general case you use co.tran all the time, where (see the post for more info)

    v = co.tran(c [, w])

means "resume coroutine c from the place it last did a co.tran, optionally passing it the value w, and when somebody does a co.tran back to *me*, resume me right here, binding v to the value *they* pass to co.tran".

Knuth complains several times that it's very hard to come up with a coroutine example that's both simple and clear <0.5 wink>. In a nutshell, coroutines don't have a "caller/callee" relationship, they have a "we're all equal partners" relationship, where any coroutine is free to resume any other one where it left off. It's no coincidence that making coroutines easy to use was pioneered by simulation languages! Just try simulating a marriage where one partner is the master and the other a slave <wink>.

i-may-be-a-bachelor-but-i-have-eyes-ly y'rs - tim
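Later versions of Python gained generators as a language feature, so the fringe walker no longer needs a thread-backed coroutine controller. A sketch in current Python 3 follows; it uses `yield` where the original used `co.detach(x)`, and it is a rewrite for illustration rather than the `Coroutine` class from the linked post.

```python
def fringe(nested):
    """Yield the non-list elements of a nested list, left to right."""
    for x in nested:
        if isinstance(x, list):
            # Delegate to a sub-generator for each nested list.
            yield from fringe(x)
        else:
            yield x


def same_fringe(a, b):
    """Compare two nested lists by their flattened element sequences."""
    return list(fringe(a)) == list(fringe(b))


flat_simple = list(fringe([1, 2, 3]))
flat_nested = list(fringe([[[[1, [2]]], 3]]))
flat_mixed = list(fringe([0, 1, [2, [3]], [4, 5], [[[6]]]]))
fringes_match = same_fringe([1, [2, 3]], [[1, 2], [3]])
```

As in the original, this covers only the generator ("half a coroutine") case: control always returns to whoever asked for the next value, rather than transferring to an arbitrary partner as `co.tran` does.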
Tim Peters wrote:
What an interesting thread! Unfortunately, all the examples are messed up since some HTML formatter didn't take care of the Python code, rendering it unreadable. Is there a different version available?

Also, I'd like to read the rest of the threads in http://www.python.org/tim_one/ but it seems that only your messages are archived?

Anyway, the citations in http://www.python.org/tim_one/000146.html show me that you have been through all of this five years ago, with a five years younger Guido who sounds a bit different than today. I would have understood him better if I had known that this is a re-iteration of a somehow dropped or entombed idea. (If someone has the original archives from that epoch, I'd be happy to get a copy. Actually, I'm missing everything up to the end of 1996.)

A short snapshot: Stackless Python is meanwhile nearly alive, with recursion avoided in ceval. Of course, some modules are left which still need work, but it is enough for a prototype. Frames now contain all necessary state and are prepared for execution and thrown back to the evaluator (elevator?).

The key idea was to change the deeply nested functions in such a way that their last eval_code call happens to be tail recursive. In ceval.c (and in other not yet changed places), functions do a lot of preparation, build some parameter, call eval_code and release the parameter. This was the crux, which I solved with a new field in the frame object, where such references can be stored. The routine can now return with the ready-packaged frame, instead of calling it.

As a minimum facility for future co-anythings, I provided a hook function for resuming frames, which causes no overhead in the usual case but allows overriding what a frame does when someone returns control to it. Implementing this is up to some extension module; whether this will be coroutines or your nice nano-threads, it's possible.

threadedly yours - chris
[Christian]
Yes, that link is the old Pythonic Award Shrine erected in my memory -- it's all me, all the time, no mercy, no escape <wink>. It predates the DejaNews archive, but the context can still be found in

    http://www.python.org/search/hypermail/python-1994q2/index.html

There's a lot in that quarter about continuations & coroutines, most from Steven Majewski, who took a serious shot at implementing all this. Don't have the code in a more usable form; when my then-employer died, most of my files went with it.

You can save the file as text, though! The structure of the code is intact, it's simply that your browser squashes out the spaces when displaying it. Nuke the <P> at the start of each code line and what remains is very close to what was originally posted.
You *used* to know that <wink>! Thought you even got StevenM's old code from him a year or so ago. He went most of the way, up until hitting the C<->Python stack intertwingling barrier, and then dropped it. Plus Guido wrote generator.py to shut me up, which works, but is about 3x clumsier to use and runs about 50x slower than a generator should <wink>.
Excellent! Running off to a movie & dinner now, but will give a more careful reading tonight. co-dependent-ly y'rs - tim
Christian Tismer <tismer@appliedbiometrics.com> wrote:
http://www.egroups.com/group/python-list/info.html has it all (almost), starting in 1991. </F>
Thoughts o' the day:

+ Generators ("semi-coroutines") are wonderful tools and easy to implement without major changes to the PVM. Icon calls 'em generators, Sather calls 'em iterators, and they're exactly what you need to implement "for thing in object:" when object represents a collection that's tricky to materialize. Python needs something like that. OTOH, generators are pretty much limited to that.

+ Coroutines are more general but much harder to implement, because each coroutine needs its own stack (a generator only has one stack *frame*-- its own --to worry about), and C-calling-Python can get into the act. As Sam said, they're probably no easier to implement than call/cc (but trivial to implement given call/cc).

+ What may be most *natural* is to forget all that and think about a variation of Python threads implemented directly via the interpreter, without using OS threads. The PVM already knows how to handle thread-state swapping. Given Christian's stackless interpreter, and barring C->Python cases, I suspect Python can fake threads all by itself, in the sense of interleaving their executions within a single "real" (OS) thread. Given the global interpreter lock, Python effectively does only-one-at-a-time anyway.

Threads are harder than generators or coroutines to learn, but

A) Many more people know how to use them already.

B) Generators and coroutines can be implemented using (real or fake) threads.

C) Python has offered threads since the beginning.

D) Threads offer a powerful mode of control transfer coroutines don't, namely "*anyone* else who can make progress now, feel encouraged to do so at my expense".

E) For whatever reasons, in my experience people find threads much easier to learn than call/cc -- perhaps because threads are *obviously* useful upon first sight, while it takes a real Zen Experience before call/cc begins to make sense.

F) Simulated threads could presumably produce much more informative error msgs (about deadlocks and such) than OS threads, so even people using real threads could find excellent debugging use for them.

Sam doesn't want to use "real threads" because they're pigs; fake threads don't have to be. Perhaps

    x = y.SOME_ASYNC_CALL(r, s, t)

could map to e.g.

    import config
    if config.USE_REAL_THREADS:
        import threading
    else:
        from simulated_threading import threading

    from config.shared import msg_queue

    class Y:
        def __init__(self, ...):
            self.ready = threading.Event()
            ...

        def SOME_ASYNC_CALL(self, r, s, t):
            result = [None]  # mutable container to hold the result
            msg_queue.put((server_of_the_day, r, s, t, self.ready, result))
            self.ready.wait()
            self.ready.clear()
            return result[0]

where some other simulated thread polls the msg_queue and does ready.set() when it's done processing the msg enqueued by SOME_ASYNC_CALL. For this to scale nicely, it's probably necessary for the PVM to cooperate with the simulated_threading implementation (e.g., a simulated thread that blocks (like on self.ready.wait()) should be taken out of the collection of simulated threads the PVM may attempt to resume -- else in Sam's case the PVM would repeatedly attempt to wake up thousands of blocked threads, and things would slow to a crawl).

Of course, simulated_threading could be built on top of call/cc or coroutines too. The point to making threads the core concept is keeping Guido's brain from exploding. Plus, as above, you can switch to "real threads" by changing an import statement.

making-sure-the-global-lock-support-hair-stays-around-even-if-greg-renders-it-moot-for-real-threads<wink>-ly y'rs - tim
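The "fake threads" idea, interleaving several flows of control inside one OS thread, can be sketched on a current Python 3 with generators standing in for simulated threads. This is a toy round-robin scheduler written for illustration, not the `simulated_threading` module imagined in the message above; `worker`, `run_round_robin`, and the log format are assumed names.

```python
from collections import deque


def worker(name, steps, log):
    """A simulated thread: do one unit of work, then give up control."""
    for i in range(steps):
        log.append(f"{name}:{i}")
        yield  # voluntary switch point


def run_round_robin(tasks):
    """Resume each runnable task in turn until all have finished."""
    ready = deque(tasks)
    while ready:
        task = ready.popleft()
        try:
            next(task)
        except StopIteration:
            continue  # finished: drop it from the runnable collection
        ready.append(task)  # still runnable: back of the queue


log = []
run_round_robin([worker("a", 2, log), worker("b", 3, log)])
```

A fuller version would keep blocked tasks in a separate collection and move them back to `ready` only when the event they wait on is set, which is the cooperation with the scheduler that the message asks for.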
Mark Hammond wrote:
Sure. Since the frame holds a pointer to the code, and the current IP and SP, your code can easily change it (with care, or GPF:) . It could even create a fresh code object and let it run only for the running instance. By instance, I mean a frame which is running a code object.
I think this has been difficult only because information was hidden in the inner interpreter loop. Gonna change now.

ciao - chris
data:image/s3,"s3://crabby-images/d3fe4/d3fe4247fa540cafafc59b32002fbfea3eed8b3a" alt=""
Mark Hammond wrote:
I'm writing a prototype of a stackless Python, which means that you will be able to access the current state of the interpreter completely. The inner interpreter loop will be isolated from the frame dispatcher. It will break whenever the ticker goes zero. If you set the ticker to one, you will be able to single step on every opcode, have the value stack, the frame chain, everything. I think, with this you can do very much. But tell me if you want a callback hook somewhere. ciao - chris -- Christian Tismer :^) <mailto:tismer@appliedbiometrics.com> Applied Biometrics GmbH : Have a break! Take a ride on Python's Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net 10553 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF we're tired of banana software - shipped green, ripens at home
data:image/s3,"s3://crabby-images/2750e/2750e63c84b199213156f78d970da1f5b8cd75be" alt=""
I think the main point is how to change code when a Python frame already references it. I dont think the structure of the frames is as important as the general concept. But while we were talking frame-fiddling it seemed a good point to try and hijack it a little :-) Would it be possible to recompile just a block of code (eg, just the current function or method) and patch it back in such a way that the current frame continues execution of the new code? I feel this is somewhat related to the inability to change class implementation for an existing instance. I know there have been hacks around this before but they arent completly reliable and IMO it would be nice if the core Python made it easier to change already running code - whether that code is in an existing stack frame, or just in an already created instance, it is very difficult to do. This has come to try and deflect some conversation away from changing Python as such towards an attempt at enhancing its _environment_. To paraphrase many people before me, even if we completely froze the language now there would still plenty of work ahead of us :-) Mark.
data:image/s3,"s3://crabby-images/2d79d/2d79d8662a2954d7c233449da5e16c43b6b627c1" alt=""
This topic sounds mostly unrelated to the stackless discussion -- in either case you need to be able to fiddle the contents of the frame and the bytecode pointer to reflect the changed function. Some issues: - The slots containing local variables may be renumbered after recompilation; fortunately we know the name--number mapping so we can move them to their new location. But it is still tricky. - Should you be able to edit functions that are present on the call stack below the top? Suppose we have two functions: def f(): return 1 + g() def g(): return 0 Suppose set a break in g(), and then edit the source of f(). We can do all sorts of evil to f(): e.g. we could change it to return g() + 2 which affects the contents of the value stack when g() returns (originally, the value stack contained the value 1, now it is empty). Or we could even change f() to return 3 thereby eliminating the call to g() altogether! What kind of limitations do other systems that support modifying a "live" program being debugged impose? Only allowing modification of the function at the top of the stack might eliminate some problems, although there are still ways to mess up. The value stack is not always empty even when we only stop at statement boundaries -- e.g. it contains 'for' loop indices, and there's also the 'block' stack, which contains try-except information. E.g. what should happen if we change def f(): for i in range(10): print 1 stopped at the 'print 1' into def f(): print 1 ??? (Ditto for removing or adding a try/except block.)
I've been thinking a bit about this. Function objects now have mutable func_code attributes (and also func_defaults), I think we can use this. The hard part is to do the analysis needed to decide which functions to recompile! Ideally, we would simply edit a file and tell the programming environment "recompile this". The programming environment would compare the changed file with the old version that it had saved for this purpose, and notice (for example) that we changed two methods of class C. It would then recompile those methods only and stuff the new code objects in the corresponding function objects. But what would it do when we changed a global variable? Say a module originally contains a statement "x = 0". Now we change the source code to say "x = 100". Should we change the variable x? Suppose that x is modified by some of the computations in the module, and the that, after some computations, the actual value of x was 50. Should the "recompile" reset x to 100 or leave it alone? One option would be to actually change the semantics of the class and def statements so that they modify an existing class or function rather than using assignment. Effectively, this proposal would change the semantics of class A: ...some code... class A: ...some more code... to be the same as class A: ...more code... ...some more code... This is somewhat similar to the way the module or package commands in some other dynamic languages work, I think; and I don't think this would break too much existing code. The proposal would also change def f(): ...some code... def f(): ...other code... but here the equivalence is not so easy to express, since I want different semantics (I don't want the second f's code to be tacked onto the end of the first f's code). If we understand that def f(): ... really does the following: f = NewFunctionObject() f.func_code = ...code object... then the construct above (def f():... def f(): ...) 
would do this: f = NewFunctionObject() f.func_code = ...some code... f.func_code = ...other code... i.e. there is no assignment of a new function object for the second def. Of course if there is a variable f but it is not a function, it would have to be assigned a new function object first. But in the case of def, this *does* break existing code. E.g. # module A from B import f . . . if ...some test...: def f(): ...some code... This idiom conditionally redefines a function that was also imported from some other module. The proposed new semantics would change B.f in place! So perhaps these new semantics should only be invoked when a special "reload-compile" is asked for... Or perhaps the programming environment could do this through source parsing as I proposed before...
Please, no more posts about Scheme. Each new post mentioning call/cc makes it *less* likely that something like that will ever be part of Python. "What if Guido's brain exploded?" :-) --Guido van Rossum (home page: http://www.python.org/~guido/)
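The in-place patching of function objects described above is easy to try. What follows is only a minimal sketch in present-day Python, where func_code and func_defaults are spelled `__code__` and `__defaults__`; the names `area`, `area_fixed` and `saved_reference` are made up for illustration and are not from the thread.

```python
def area(width, height=1):
    # Original (buggy) version: adds instead of multiplying.
    return width + height


def area_fixed(width, height=1):
    # Corrected version, standing in for freshly recompiled source.
    return width * height


# A second reference to the same function object, as a caller or a
# callback table might hold.
saved_reference = area

before = saved_reference(3, 4)

# Patch the existing function object in place instead of rebinding the name.
area.__code__ = area_fixed.__code__
area.__defaults__ = area_fixed.__defaults__

after = saved_reference(3, 4)
print(before, after)
```

Because the function object itself is patched, every existing reference picks up the new behaviour, which is the property the "recompile this" idea relies on.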
Guido> What kind of limitations do other systems that support modifying
Guido> a "live" program being debugged impose? Only allowing
Guido> modification of the function at the top of the stack might
Guido> eliminate some problems, although there are still ways to mess
Guido> up.

Frame objects maintain pointers to the active code objects, locals and globals, so modifying a function object's code or globals shouldn't have any effect on currently executing frames, right? I assume frame objects do the usual INCREF/DECREF dance, so the old code object won't get deleted before the frame object is tossed.

Guido> But what would it do when we changed a global variable? Say a
Guido> module originally contains a statement "x = 0". Now we change
Guido> the source code to say "x = 100". Should we change the variable
Guido> x? Suppose that x is modified by some of the computations in the
Guido> module, and that, after some computations, the actual value
Guido> of x was 50. Should the "recompile" reset x to 100 or leave it
Guido> alone?

I think you should note the change for users and give them some way to easily pick between old initial value, new initial value or current value.

Guido> Please, no more posts about Scheme. Each new post mentioning
Guido> call/cc makes it *less* likely that something like that will ever
Guido> be part of Python. "What if Guido's brain exploded?" :-)

I agree. I see call/cc or set! and my eyes just glaze over...

Skip Montanaro | Mojam: "Uniting the World of Music" http://www.mojam.com/
skip@mojam.com | Musi-Cal: http://www.musi-cal.com/
518-372-5583
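Skip's expectation about executing frames can be checked directly. A small sketch in present-day Python (the function names are invented for the demonstration): a function replaces its own code object while its frame is still running.

```python
def replacement():
    return "new code"


def patch_myself():
    # Swap this function's code object while its own frame is running.
    patch_myself.__code__ = replacement.__code__
    # The active frame still holds a reference to the old code object,
    # so execution continues here rather than jumping into the new code.
    return "old code"


first_call = patch_myself()   # runs the old code to completion
second_call = patch_myself()  # a fresh frame picks up the new code
print(first_call, second_call)
```

The first call finishes with the old code and only the next call sees the change, which is why a restart of the edited function comes up later in the thread.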
[Guido writes...]
This topic sounds mostly unrelated to the stackless discussion -- in
Sure is - I just saw that as an excuse to try and hijack it <wink>
Some issues:
- The slots containing local variables may be renumbered after
Generally, I think we could make something very useful even with a number of limitations. For example, I would find a first cut completely acceptable and a great improvement on today if:

* Only the function at the top of the stack can be recompiled and have the code reflected while executing. This function also must be restarted after such an edit. If the function uses global variables or makes calls that restarting will screw up, then either a) make the code changes _before_ doing this stuff, or b) live with it for now, and help us remove the limitation :-)

That may make the locals being renumbered easier to deal with, and also remove some of the problems you discussed about editing functions below the top.
What kind of limitations do other systems that support modifying a "live" program being debugged impose? Only allowing modification of
I can only speak for VC, and from experience at that - I haven't attempted to find documentation on it.

It accepts most changes while running. The current line is fine. If you create or change the definition of globals (and possibly even the type of locals?), the "incremental compilation" fails, and you are given the option of continuing with the old code, or stopping the process and doing a full build. When the debug session terminates, some link process (and maybe even compilation?) is done to bring the .exe on disk up to date with the changes.

If you do weird stuff like delete the line being executed, it usually gives you some warning message before either restarting the function or trying to pick a line somewhere near the line you deleted. Either way, it can screw up, moving the "current" line somewhere else - it doesn't crash the debugger, but may not do exactly what you expected. It is still a _huge_ win, and a great feature!

Ironically, I turn this feature _off_ for Python extensions. Although changing the C code is great, in 99% of the cases I also need to change some .py code, and as existing instances are affected I need to restart the app anyway - so I may as well do a normal build at that time. i.e., C now lets me debug incrementally, but a far more dynamic language prevents this feature being useful ;-)
If we forced a restart would this be better? Can we reliably reset the stack to the start of the current function?
If this would work for the few changed functions/methods, what would the impact be of doing it for _every_ function (changed or not)? Then the analysis can drop to the module level, which is much easier. I don't think a slight performance hit is a problem at all when doing this stuff.
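A module-level pass of this kind might look roughly like the sketch below, in present-day Python. The helper `patch_functions` and the sample source strings are assumptions made for illustration, not an existing tool. It handles only top-level functions, and it deliberately leaves module globals alone, which is just one of the possible answers to the "x = 0" versus "x = 100" question raised earlier.

```python
import types


def patch_functions(live_namespace, new_source):
    """Recompile new_source and swap code objects into existing functions.

    Hypothetical helper: only top-level functions are handled. Swapping
    code into a closure with a different number of free variables would
    raise ValueError, and methods inside classes are not touched.
    """
    scratch = {}
    exec(compile(new_source, "<reload>", "exec"), scratch)
    patched = []
    for name, new_obj in scratch.items():
        old_obj = live_namespace.get(name)
        if isinstance(new_obj, types.FunctionType) and isinstance(
            old_obj, types.FunctionType
        ):
            # Every function is patched, changed or not; no diffing needed.
            old_obj.__code__ = new_obj.__code__
            old_obj.__defaults__ = new_obj.__defaults__
            patched.append(name)
    return sorted(patched)


# Simulate a loaded module whose global x has drifted from 0 to 50.
live = {}
exec("x = 0\ndef step():\n    return x + 1\n", live)
saved = live["step"]
live["x"] = 50

patched = patch_functions(live, "x = 100\ndef step():\n    return x + 2\n")
result = saved()  # new code, but still reading the live value of x
print(patched, result, live["x"])
```

The patched function keeps its original globals dictionary, so it runs the new code against the current program state rather than the freshly executed scratch copy.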
Or extending this (didn't this come up at the latest IPC?)

    # .\package\__init__.py
    class BigMutha:
        pass

    # .\package\something.py
    class package.BigMutha:
        def some_category_of_methods():
            ...

    # .\package\other.py
    class package.BigMutha:
        def other_category_of_methods():
            ...

[Of course, this won't fly as it stands; just a conceptual possibility]
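The "class statement extends an existing class" idea can be approximated without changing the language. The following is a sketch only, in present-day Python; the `extend` decorator is hypothetical and simply copies the attributes of a throwaway class body onto the class that already exists.

```python
def extend(existing_class):
    """Hypothetical decorator: merge a class body into existing_class."""

    def merge(addition):
        for name, value in vars(addition).items():
            # Skip dunder bookkeeping such as __module__, __dict__, __doc__.
            if not (name.startswith("__") and name.endswith("__")):
                setattr(existing_class, name, value)
        # Rebind the name to the original class, not the throwaway one.
        return existing_class

    return merge


class BigMutha:
    pass


instance = BigMutha()  # created before the extra methods exist


@extend(BigMutha)
class BigMutha:
    def some_category_of_methods(self):
        return "some"


@extend(BigMutha)
class BigMutha:
    def other_category_of_methods(self):
        return "other"


print(instance.some_category_of_methods(), instance.other_category_of_methods())
```

Since the original class object is modified rather than replaced, instances created before the extension see the new methods too.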
OK, restarting the function seems a reasonable compromise and would seem relatively easy to implement. Not *real* easy though: it turns out that eval_code2() is called with a code object as argument, and it's not entirely trivial to figure out the corresponding function object from which to grab the new code object. But it could be done -- give it a try. (Don't wait for me, I'm ducking for cover until at least mid June.)
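The lookup from a code object back to its function object can at least be prototyped from Python. This is a rough sketch in present-day CPython using `gc.get_referrers`, which is a debugging aid rather than something the interpreter itself would use; the helper name `functions_using` is invented here.

```python
import gc
import types


def functions_using(code):
    """Return the function objects whose __code__ is exactly this code object."""
    return [
        obj
        for obj in gc.get_referrers(code)
        # Other referrers (e.g. constant tuples of enclosing code) are ignored.
        if isinstance(obj, types.FunctionType) and obj.__code__ is code
    ]


def sample():
    return 1


owners = functions_using(sample.__code__)
print([f.__name__ for f in owners])
```

A scan like this is slow and relies on the garbage collector's bookkeeping, so it only illustrates that the mapping is recoverable, not how it should be done inside the evaluator.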
I hear you.
[Mark]
If we forced a restart would this be better? Can we reliably reset the stack to the start of the current function?
Yes, no problem.
Yes, this would be fine too.
Have no fear. I've learned to say no. :-) --Guido van Rossum (home page: http://www.python.org/~guido/)
[GvR]
As an ex-compiler guy, I should have something wise to say about that. Alas, I've never used a system that allowed more than poking new values into vrbls, and the thought of any more than that makes me vaguely ill! Oh, that's right -- I'm vaguely ill anyway today. Still-- oooooh -- the problems. This later got reduced to restarting the topmost function from scratch. That has some attraction, especially on the bang-for-buck-o-meter.
What a pussy <wink>. Really, overall continuations are much less trouble to understand than threads -- there's only one function in the entire interface! OK. So how do you feel about coroutines? Would sure be nice to have *some* way to get pseudo-parallel semantics regardless of OS. changing-code-on-the-fly-==-mutating-the-current-continuation-ly y'rs - tim
On Fri, 21 May 1999, Tim Peters wrote:
OK. So how do you feel about coroutines? Would sure be nice to have *some* way to get pseudo-parallel semantics regardless of OS.
I read about coroutines years ago on c.l.py, but I admit I forgot it all. Can you explain them briefly in pseudo-python? --david
[Tim]
OK. So how do you feel about coroutines? Would sure be nice to have *some* way to get pseudo-parallel semantics regardless of OS.
[David Ascher]
I read about coroutines years ago on c.l.py, but I admit I forgot it all. Can you explain them briefly in pseudo-python?
How about real Python? http://www.python.org/tim_one/000169.html contains a complete coroutine implementation using threads under the covers (& exactly 5 years old tomorrow <wink>). If I were to do it over again, I'd use a different object interface (making coroutines objects in their own right instead of funneling everything through a "coroutine controller" object), but the ideas are the same in every coroutine language. The post contains several executable examples, from simple to "literature standard".

I had forgotten all about this: it contains solutions to the same "compare tree fringes" problem Sam mentioned, *and* the generator-based building block I posted three other solutions for in this thread. That last looks like:

    # fringe visits a nested list in inorder, and detaches for each non-list
    # element; raises EarlyExit after the list is exhausted
    def fringe( co, list ):
        for x in list:
            if type(x) is type([]):
                fringe(co, x)
            else:
                co.detach(x)

    def printinorder( list ):
        co = Coroutine()
        f = co.create(fringe, co, list)
        try:
            while 1:
                print co.tran(f),
        except EarlyExit:
            pass
        print

    printinorder([1,2,3])  # 1 2 3
    printinorder([[[[1,[2]]],3]])  # ditto
    x = [0, 1, [2, [3]], [4,5], [[[6]]] ]
    printinorder(x)  # 0 1 2 3 4 5 6

Generators are really "half a coroutine", so this doesn't show the full power (other examples in the post do). co.detach is a special way to deal with this asymmetry. In the general case you use co.tran all the time, where (see the post for more info)

    v = co.tran(c [, w])

means "resume coroutine c from the place it last did a co.tran, optionally passing it the value w, and when somebody does a co.tran back to *me*, resume me right here, binding v to the value *they* pass to co.tran".

Knuth complains several times that it's very hard to come up with a coroutine example that's both simple and clear <0.5 wink>.
In a nutshell, coroutines don't have a "caller/callee" relationship, they have "we're all equal partners" relationship, where any coroutine is free to resume any other one where it left off. It's no coincidence that making coroutines easy to use was pioneered by simulation languages! Just try simulating a marriage where one partner is the master and the other a slave <wink>. i-may-be-a-bachelor-but-i-have-eyes-ly y'rs - tim
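For readers on present-day Python, where generators are built into the language, the fringe building block above can be written without a coroutine controller. This is a sketch of the same idea rather than a port of the linked post; `same_fringe` is an illustrative name for the "compare tree fringes" problem.

```python
from itertools import zip_longest

_MISSING = object()  # sentinel that never equals a real element


def fringe(nested):
    """Yield the non-list elements of a nested list, left to right."""
    for item in nested:
        if isinstance(item, list):
            yield from fringe(item)
        else:
            yield item


def same_fringe(left, right):
    """Compare two nested lists element by element, stopping at the first mismatch."""
    pairs = zip_longest(fringe(left), fringe(right), fillvalue=_MISSING)
    return all(a == b for a, b in pairs)


flat = list(fringe([0, 1, [2, [3]], [4, 5], [[[6]]]]))
print(flat)
print(same_fringe([1, 2, 3], [[[[1, [2]]], 3]]))
```

As with co.detach, each generator only hands values back to whoever resumed it, so this covers the "half a coroutine" case and not the symmetric co.tran transfers.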
Tim Peters wrote:
What an interesting thread! Unfortunately, all the examples are messed up since some HTML formatter didn't take care of the Python code, rendering it unreadable. Is there a different version available? Also, I'd like to read the rest of the threads in http://www.python.org/tim_one/ but it seems that only your messages are archived?

Anyway, the citations in http://www.python.org/tim_one/000146.html show me that you have been through all of this five years ago, with a five years younger Guido who sounds a bit different than today. I would have understood him better if I had known that this is a re-iteration of a somehow dropped or entombed idea. (If someone has the original archives from that epoch, I'd be happy to get a copy. Actually, I'm missing everything up to the end of 1996.)

A short snapshot: Stackless Python is meanwhile nearly alive, with recursion avoided in ceval. Of course, some modules are left which still need work, but it is enough for a prototype. Frames now contain all necessary state and are prepared for execution and thrown back to the evaluator (elevator?).

The key idea was to change the deeply nested functions in such a way that their last eval_code call happens to be tail recursive. In ceval.c (and in other not yet changed places), functions do a lot of preparation, build some parameter, call eval_code and release the parameter. This was the crux, which I solved with a new field in the frame object, where such references can be stored. The routine can now return with the ready packaged frame, instead of calling it.

As a minimum facility for future co-anythings, I provided a hook function for resuming frames, which causes no overhead in the usual case but allows overriding what a frame does when someone returns control to it. Implementing this is left to some extension module; whether this may be coroutines or your nice nano-threads, it's possible.
threadedly yours - chris
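The "return the packaged frame instead of calling it" idea has a loose pure-Python analogue in the trampoline pattern. The sketch below is only that analogy and says nothing about how the C implementation is structured; `sum_down` and `dispatch` are invented names, and the tuples stand in for prepared frames.

```python
def sum_down(n, acc=0):
    # Instead of calling itself, describe the next call and return it.
    if n == 0:
        return ("done", acc)
    return ("call", sum_down, (n - 1, acc + n))


def dispatch(func, args):
    """Run packaged calls in a flat loop, so nesting depth never grows."""
    state = ("call", func, args)
    while state[0] == "call":
        _, func, args = state
        state = func(*args)
    return state[1]


# 10000 steps would exceed the default recursion limit if written
# as ordinary recursion; the dispatcher handles it in one loop.
total = dispatch(sum_down, (10000,))
print(total)
```

Because every step returns to the dispatcher, the dispatcher is the one place where execution could be paused, inspected or redirected.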
[Christian]
Yes, that link is the old Pythonic Award Shrine erected in my memory -- it's all me, all the time, no mercy, no escape <wink>. It predates the DejaNews archive, but the context can still be found in http://www.python.org/search/hypermail/python-1994q2/index.html There's a lot in that quarter about continuations & coroutines, most from Steven Majewski, who took a serious shot at implementing all this.

Don't have the code in a more usable form; when my then-employer died, most of my files went with it. You can save the file as text, though! The structure of the code is intact, it's simply that your browser squashes out the spaces when displaying it. Nuke the <P> at the start of each code line and what remains is very close to what was originally posted.
You *used* to know that <wink>! Thought you even got StevenM's old code from him a year or so ago. He went most of the way, up until hitting the C<->Python stack intertwingling barrier, and then dropped it. Plus Guido wrote generator.py to shut me up, which works, but is about 3x clumsier to use and runs about 50x slower than a generator should <wink>.
Excellent! Running off to a movie & dinner now, but will give a more careful reading tonight. co-dependent-ly y'rs - tim
Christian Tismer <tismer@appliedbiometrics.com> wrote:
http://www.egroups.com/group/python-list/info.html has it all (almost), starting in 1991. </F>
Thoughts o' the day:

+ Generators ("semi-coroutines") are wonderful tools and easy to implement without major changes to the PVM. Icon calls 'em generators, Sather calls 'em iterators, and they're exactly what you need to implement "for thing in object:" when object represents a collection that's tricky to materialize. Python needs something like that. OTOH, generators are pretty much limited to that.

+ Coroutines are more general but much harder to implement, because each coroutine needs its own stack (a generator only has one stack *frame*-- its own --to worry about), and C-calling-Python can get into the act. As Sam said, they're probably no easier to implement than call/cc (but trivial to implement given call/cc).

+ What may be most *natural* is to forget all that and think about a variation of Python threads implemented directly via the interpreter, without using OS threads. The PVM already knows how to handle thread-state swapping. Given Christian's stackless interpreter, and barring C->Python cases, I suspect Python can fake threads all by itself, in the sense of interleaving their executions within a single "real" (OS) thread. Given the global interpreter lock, Python effectively does only-one-at-a-time anyway.

Threads are harder than generators or coroutines to learn, but

A) Many more people know how to use them already.

B) Generators and coroutines can be implemented using (real or fake) threads.

C) Python has offered threads since the beginning.

D) Threads offer a powerful mode of control transfer coroutines don't, namely "*anyone* else who can make progress now, feel encouraged to do so at my expense".

E) For whatever reasons, in my experience people find threads much easier to learn than call/cc -- perhaps because threads are *obviously* useful upon first sight, while it takes a real Zen Experience before call/cc begins to make sense.

F) Simulated threads could presumably produce much more informative error msgs (about deadlocks and such) than OS threads, so even people using real threads could find excellent debugging use for them.

Sam doesn't want to use "real threads" because they're pigs; fake threads don't have to be. Perhaps

    x = y.SOME_ASYNC_CALL(r, s, t)

could map to e.g.

    import config
    if config.USE_REAL_THREADS:
        import threading
    else:
        from simulated_threading import threading

    from config.shared import msg_queue

    class Y:
        def __init__(self, ...):
            self.ready = threading.Event()
            ...

        def SOME_ASYNC_CALL(self, r, s, t):
            result = [None]  # mutable container to hold the result
            msg_queue.put((server_of_the_day, r, s, t, self.ready, result))
            self.ready.wait()
            self.ready.clear()
            return result[0]

where some other simulated thread polls the msg_queue and does ready.set() when it's done processing the msg enqueued by SOME_ASYNC_CALL. For this to scale nicely, it's probably necessary for the PVM to cooperate with the simulated_threading implementation (e.g., a simulated thread that blocks (like on self.ready.wait()) should be taken out of the collection of simulated threads the PVM may attempt to resume -- else in Sam's case the PVM would repeatedly attempt to wake up thousands of blocked threads, and things would slow to a crawl).

Of course, simulated_threading could be built on top of call/cc or coroutines too. The point to making threads the core concept is keeping Guido's brain from exploding. Plus, as above, you can switch to "real threads" by changing an import statement.

making-sure-the-global-lock-support-hair-stays-around-even-if-greg-renders-it-moot-for-real-threads<wink>-ly y'rs - tim
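The request/wait/reply handshake in the sketch above can be run today with real threads from the standard library. The version below is a minimal illustration under that assumption; the `config` and `simulated_threading` modules from the message are not real, so they are replaced by `threading` and `queue`, and the server's "work" (adding three numbers) is invented for the demonstration.

```python
import queue
import threading

msg_queue = queue.Queue()


def server():
    """Poll the queue, process each message, and wake the waiting caller."""
    while True:
        msg = msg_queue.get()
        if msg is None:  # shutdown marker used only by this demo
            break
        r, s, t, ready, result = msg
        result[0] = r + s + t  # stand-in for the real work
        ready.set()


class Y:
    def __init__(self):
        self.ready = threading.Event()

    def some_async_call(self, r, s, t):
        result = [None]  # mutable container to hold the result
        msg_queue.put((r, s, t, self.ready, result))
        self.ready.wait()   # block until the server signals completion
        self.ready.clear()
        return result[0]


worker = threading.Thread(target=server)
worker.start()
answer = Y().some_async_call(1, 2, 3)
msg_queue.put(None)
worker.join()
print(answer)
```

The calling code reads as a plain blocking call, which is the property that would let a simulated thread package be swapped in behind the same interface.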
Mark Hammond wrote:
Sure. Since the frame holds a pointer to the code, and the current IP and SP, your code can easily change it (with care, or GPF:) . It could even create a fresh code object and let it run only for the running instance. By instance, I mean a frame which is running a code object.
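The state a frame carries can be looked at from Python itself. The sketch below uses `sys._getframe`, a CPython-specific accessor, and only reads the frame; the function names and the dictionary keys are chosen here for illustration.

```python
import sys


def describe_current_frame():
    frame = sys._getframe()  # CPython-specific: the frame running this call
    return {
        "function": frame.f_code.co_name,     # the code object being run
        "line": frame.f_lineno,               # current source line
        "instruction_offset": frame.f_lasti,  # index of the last bytecode executed
        "caller": frame.f_back.f_code.co_name if frame.f_back else None,
    }


def caller():
    return describe_current_frame()


info = caller()
print(info["function"], info["caller"])
```

The code object, instruction position and link to the calling frame are all visible this way; changing them from outside is the part that needs interpreter support.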
I think this has been difficult only because information was hidden in the inner interpreter loop. Gonna change now. ciao - chris
participants (7)
- Christian Tismer
- David Ascher
- Fredrik Lundh
- Guido van Rossum
- Mark Hammond
- Skip Montanaro
- Tim Peters