I'll try to sketch here the scheme I'm thinking of for the callback/breakpoint issue (without SET_LINENO), although some technical details are still missing.

I'm assuming the following, in this order:

1) No radical changes in the current behavior, i.e. preserve the current architecture / strategy as much as possible.

2) We don't have breakpoints per opcode, but per source line. For that matter, we have sys.settrace (and for now, we don't aim to have a sys.settracei that would be called on every opcode, although we might want this in the future).

3) SET_LINENO disappears. Actually, the SET_LINENO opcodes are conditional breakpoints, used for callbacks from C to Python. So the basic problem is to generate these callbacks.

If any of the above is not an appropriate assumption and we want a radical change in the strategy of setting breakpoints / generating callbacks, then this post is invalid.

The solution I'm thinking of:

a) Currently, we have a function PyCode_Addr2Line which computes the source line from the opcode's address. I hereby assume that we can write the reverse function PyCode_Line2Addr that returns the address for a given source line number. I don't have the implementation, but it should be doable. Furthermore, having the co_lnotab table and co_firstlineno, we can compute the source line range for a code object. As a consequence, even with the dumbest of all algorithms, by looping through this source line range, we can enumerate with PyCode_Line2Addr the sequence of addresses for the source lines of this code object.

b) As Chris pointed out, in case sys.settrace is defined, we can allocate and keep a copy of the original code string per frame. We can further dynamically overwrite the original code string with a new (internal, one byte) CALL_TRACE opcode at the addresses we have enumerated in a). The CALL_TRACE opcodes will trigger the callbacks from C to Python, just as the current SET_LINENO does.

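To make a) and b) a bit more concrete, here is a rough Python model of the idea, not an implementation. It assumes the line table is a list of (addr_incr, line_incr) pairs laid out like co_lnotab. The names line_starts, line2addr and patch_for_tracing, as well as the value chosen for CALL_TRACE, are made up for illustration. The real thing would be C code working on the code object.

```python
# Hypothetical model: lnotab is a list of (addr_incr, line_incr) pairs,
# mirroring the co_lnotab layout. Nothing here touches a real code object.

CALL_TRACE = 0xFF  # assumed value for the internal one-byte opcode


def line_starts(firstlineno, lnotab):
    """Return [(line, addr), ...] for the first opcode of each source line."""
    starts = []
    addr, line, last = 0, firstlineno, None
    for addr_incr, line_incr in lnotab:
        if addr_incr:
            # The address is about to advance, so the current line
            # really owns the opcode at the current address.
            if line != last:
                starts.append((line, addr))
                last = line
            addr += addr_incr
        line += line_incr
    if line != last:
        starts.append((line, addr))
    return starts


def line2addr(firstlineno, lnotab, lineno):
    """Reverse of Addr2Line: address of the first opcode of lineno, or -1."""
    for line, addr in line_starts(firstlineno, lnotab):
        if line == lineno:
            return addr
    return -1


def patch_for_tracing(code, firstlineno, lnotab):
    """Keep a copy of the code string and overwrite each line start."""
    original = bytes(code)
    patched = bytearray(code)
    for _line, addr in line_starts(firstlineno, lnotab):
        patched[addr] = CALL_TRACE
    return original, bytes(patched)


# Toy data: a 12-byte code string whose first line is 10.
table = [(0, 1), (6, 1), (4, 2)]
code = bytes(range(12))
original, patched = patch_for_tracing(code, 10, table)
```

With this toy table the line starts are lines 11, 12 and 14 at addresses 0, 6 and 10. Line 13 has no opcode of its own, so line2addr returns -1 for it.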
c) At execution time, whenever a CALL_TRACE opcode is reached, we trigger the callback and if it returns successfully, we'll fetch the original opcode for the current location from the copy of the original co_code. Then we directly jump to the arg fetch code (or in case we fetch the entire original opcode in CALL_TRACE - we jump to the dispatch code).

Hmm. I think that's all. At the heart of this scheme is the PyCode_Line2Addr function, which is the only blob in my head, for now.

Christian Tismer wrote:
I didn't think of this before, but I just realized that I have something like that already in Stackless Python. It is possible to set a breakpoint at every opcode, for every frame. Adding an extra opcode for breakpoints is a good thing as well. The former are good for tracing, conditional breakpoints and such, and cost a little more time since there is always one extra function call. The latter would be a quick, less versatile thing.
I don't think I understand clearly the difference you're talking about, and why the one thing is better than the other, probably because I'm a bit far from Stackless Python.
I'm going to finish and publish the stackless/continous package and submit a paper by end of September. Should I include this debugging feature?
Write the paper first, you have more than enough material to talk about already ;-). Then if you have time to implement some debugging support, you could always add another section, but it won't be a central point of your paper.

-- 
Vladimir MARANGOZOV | Vladimir.Marangozov@inrialpes.fr
http://sirac.inrialpes.fr/~marangoz | tel:(+33-4)76615277 fax:76615252