Extending traceback to (optionally) format and show locals.

For context please see http://bugs.python.org/issue22937 and http://bugs.python.org/issue22936.
I have two questions I'm hoping to get answered through this thread:
- does the change in question need a PEP? Antoine seemed to think it didn't, as long as it was opt-in (via the unittest CLI etc)
- Is my implementation approach sound (for traceback, unittest I think I have covered :))?
Implementation wise, I think it's useful to work in the current traceback module layout - that is to alter extract_stack to (optionally) include rendered data about locals and then look for that and format it in format_list.
I'm sure there is code out there that depends on the quadruple nature of extract_stack though, so I think we need to preserve that. Three strategies occurred to me; one is to have parallel functions, one quadruple, one quintuple. A second one is to have the return value of extract_stack be a quintuple when a new keyword parameter include_locals is included. Lastly, and this is my preferred one, we return a tuple subclass with an attribute containing a dict with the rendered data on the locals; this can be present but None, or even just absent when extract_stack was not asked to include locals.
The last option is my preferred one because the other two both imply having a data structure which is likely to break existing code - and while you'd have to opt into having them it seems likely to require a systematic set of upgrades vs having an optional attribute that can be looked up.
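A minimal sketch of the third strategy, assuming a hypothetical class name FrameEntry and a hypothetical attribute name locals (neither is taken from the traceback module or from the patches under discussion). It shows that a tuple subclass can carry the rendered locals as an attribute while still unpacking as a quadruple:

```python
class FrameEntry(tuple):
    """Hypothetical (filename, lineno, name, line) entry with optional locals."""

    def __new__(cls, filename, lineno, name, line, locals=None):
        # The tuple payload stays a quadruple, so existing unpacking code keeps working.
        self = super().__new__(cls, (filename, lineno, name, line))
        # Rendered locals ride along as a plain attribute; None when not requested.
        self.locals = locals
        return self


entry = FrameEntry("example.py", 10, "main", "x = compute()", locals={"x": "42"})
filename, lineno, name, line = entry  # legacy-style unpacking is unaffected
plain = FrameEntry("example.py", 11, "main", "return x")
```

Code that knows about the extra data can read entry.locals (or use getattr with a default), while code that only knows the quadruple never sees it.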
So - thoughts?
-Rob

On Nov 26, 2014, at 15:45, Robert Collins robertc@robertcollins.net wrote:
For context please see http://bugs.python.org/issue22937 and http://bugs.python.org/issue22936.
I have two questions I'm hoping to get answered through this thread:
- does the change in question need a PEP? Antoine seemed to think it
didn't, as long as it was opt-in (via the unittest CLI etc)
- Is my implementation approach sound (for traceback, unittest I
think I have covered :))?
Implementation wise, I think it's useful to work in the current traceback module layout - that is to alter extract_stack to (optionally) include rendered data about locals and then look for that and format it in format_list.
I'm sure there is code out there that depends on the quadruple nature of extract_stack though, so I think we need to preserve that. Three strategies occurred to me; one is to have parallel functions, one quadruple, one quintuple. A second one is to have the return value of extract_stack be a quintuple when a new keyword parameter include_locals is included. Lastly, and this is my preferred one, we return a tuple subclass with an attribute containing a dict with the rendered data on the locals; this can be present but None, or even just absent when extract_stack was not asked to include locals.
There are lots of other cases in the stdlib where something is usable as a tuple of n fields or as a structseq/namedtuple of >n fields: stat results, struct_tm, etc. So, why not do the same thing here?
The last option is my preferred one because the other two both imply having a data structure which is likely to break existing code - and while you'd have to opt into having them it seems likely to require a systematic set of upgrades vs having an optional attribute that can be looked up.
So - thoughts?
-Rob
-- Robert Collins rbtcollins@hp.com Distinguished Technologist HP Converged Cloud

On 27 November 2014 at 14:12, Andrew Barnert abarnert@yahoo.com wrote:
On Nov 26, 2014, at 15:45, Robert Collins robertc@robertcollins.net wrote:
I'm sure there is code out there that depends on the quadruple nature of extract_stack though, so I think we need to preserve that. Three strategies occurred to me; one is to have parallel functions, one quadruple, one quintuple. A second one is to have the return value of extract_stack be a quintuple when a new keyword parameter include_locals is included. Lastly, and this is my preferred one, we return a tuple subclass with an attribute containing a dict with the rendered data on the locals; this can be present but None, or even just absent when extract_stack was not asked to include locals.
There are lots of other cases in the stdlib where something is usable as a tuple of n fields or as a structseq/namedtuple of >n fields: stat results, struct_tm, etc. So, why not do the same thing here?
Because backwards compatibility. Moving to a namedtuple is fine - changing the length of the tuple is a problem.
-Rob

On Nov 27, 2014, at 16:48, Robert Collins robertc@robertcollins.net wrote:
On 27 November 2014 at 14:12, Andrew Barnert abarnert@yahoo.com wrote:
There are lots of other cases in the stdlib where something is usable as a tuple of n fields or as a structseq/namedtuple of >n fields: stat results, struct_tm, etc. So, why not do the same thing here?
Because backwards compatibility. Moving to a namedtuple is fine - changing the length of the tuple is a problem.
That's the whole point: you're _not_ changing the length. Again, look at the examples that are already in the stdlib: they have a fixed length as a tuple, with extra fields accessible by name only. And it's dead easy to do with structseq.
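The stdlib behaviour being described can be checked directly. The sketch below uses os.stat_result, one of the structseq types mentioned above: it has a fixed length as a tuple, while further fields such as st_mtime_ns are reachable by name only.

```python
import os

result = os.stat(os.curdir)

# As a sequence, a stat result has a fixed number of fields.
tuple_length = len(result)
size_by_index = result[6]       # index 6 is the size field
size_by_name = result.st_size   # the same value, accessed by name

# Extra fields exist as attributes only and do not change the tuple length.
mtime_ns = result.st_mtime_ns
print(tuple_length, size_by_index == size_by_name, mtime_ns)
```

Existing code that indexes or unpacks the tuple is unaffected by the name-only fields, which is the compatibility property being argued for here.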

On Thu, Nov 27, 2014, at 19:48, Robert Collins wrote:
On 27 November 2014 at 14:12, Andrew Barnert abarnert@yahoo.com wrote:
There are lots of other cases in the stdlib where something is usable as a tuple of n fields or as a structseq/namedtuple of >n fields: stat results, struct_tm, etc. So, why not do the same thing here?
Because backwards compatibility. Moving to a namedtuple is fine - changing the length of the tuple is a problem.
Er, but what is being suggested is to do the same backwards-compatible thing: move to a namedtuple-like object with extra non-tuple fields, just like those others. I'm confused as to what is the conflict here.

On 1 December 2014 at 10:30, random832@fastmail.us wrote:
On Thu, Nov 27, 2014, at 19:48, Robert Collins wrote:
On 27 November 2014 at 14:12, Andrew Barnert abarnert@yahoo.com wrote:
There are lots of other cases in the stdlib where something is usable as a tuple of n fields or as a structseq/namedtuple of >n fields: stat results, struct_tm, etc. So, why not do the same thing here?
Because backwards compatibility. Moving to a namedtuple is fine - changing the length of the tuple is a problem.
Er, but what is being suggested is to do the same backwards-compatible thing: move to a namedtuple-like object with extra non-tuple fields, just like those others. I'm confused as to what is the conflict here.
The thing I was missing is that Andrew was referring to a C-only API - AFAICT there is no Python equivalent to PyStructSequence (other than implementing __len__ etc. oneself - which is fine, but it's not structseq then, AIUI). NamedTuple would imply changing the length - and there's no reason to reimplement traceback in C, so I'd rather not do that.
Anyhow, looks like there is a strong desire for a fresh API anyway in 17911, so I'm just going to do that.
-Rob

On 27 November 2014 at 09:45, Robert Collins robertc@robertcollins.net wrote:
For context please see http://bugs.python.org/issue22937 and http://bugs.python.org/issue22936.
Another useful bit of context is the current RFE to have a way to extract and save a traceback *summary* that omits the local variable details: http://bugs.python.org/issue17911
Our inclination to resolve that one was to design a new higher level traceback manipulation API, which seems relevant to this proposal as well.
I have two questions I'm hoping to get answered through this thread:
- does the change in question need a PEP? Antoine seemed to think it
didn't, as long as it was opt-in (via the unittest CLI etc)
I agree this isn't a PEP level change, just a normal RFE. That said, if the API design details get too confusing, a PEP may still end up being a useful way of clarifying specific details.
- Is my implementation approach sound (for traceback, unittest I
think I have covered :))?
Implementation wise, I think it's useful to work in the current traceback module layout - that is to alter extract_stack to (optionally) include rendered data about locals and then look for that and format it in format_list.
For 17911, we're not so sure about that - there's a strong case to be made for exposing a new object-oriented API, rather than continuing with the C-style "records + functions that work on them" model that the traceback module currently uses.
We made similar additions over the last couple of releases for both inspect (via inspect.Signature & inspect.Parameter) and dis (via dis.Bytecode & dis.Instruction).
I'm sure there is code out there that depends on the quadruple nature of extract_stack though, so I think we need to preserve that. Three strategies occurred to me; one is to have parallel functions, one quadruple, one quintuple. A second one is to have the return value of extract_stack be a quintuple when a new keyword parameter include_locals is included. Lastly, and this is my preferred one, we return a tuple subclass with an attribute containing a dict with the rendered data on the locals; this can be present but None, or even just absent when extract_stack was not asked to include locals.
Fourth, expand the new ExceptionSummary API proposed in 17911 to also include the ability to *optionally* preserve the local variable data from the stack frame, rather than always discarding it.
The last option is my preferred one because the other two both imply having a data structure which is likely to break existing code - and while you'd have to opt into having them it seems likely to require a systematic set of upgrades vs having an optional attribute that can be looked up.
So - thoughts?
I don't see a lot of value in adhering too strictly to the current API design model - I think it would make more sense to design a new higher level API, and then look at including some elements to make it easy to adopt the new tools in existing code without having to rewrite the world (e.g. the inspect module work in Python 3.4 that switched the legacy APIs to actually call the new inspect.signature API internally, greatly expanding the scope of the types they could handle).
Cheers, Nick.

On 27 November 2014 at 16:08, Nick Coghlan ncoghlan@gmail.com wrote:
On 27 November 2014 at 09:45, Robert Collins robertc@robertcollins.net wrote:
For context please see http://bugs.python.org/issue22937 and http://bugs.python.org/issue22936.
Another useful bit of context is the current RFE to have a way to extract and save a traceback *summary* that omits the local variable details: http://bugs.python.org/issue17911
AIUI the specific desire is to allow a minimal cost capture of tracebacks without frames (so as to allow gc), with enough data to render a traceback later should the thing turn out to escape the system. So largely skipping linecache lookup etc.
I think that's a good thing to do :).
Our inclination to resolve that one was to design a new higher level traceback manipulation API, which seems relevant to this proposal as well.
....
- Is my implementation approach sound (for traceback, unittest I
think I have covered :))?
Implementation wise, I think it's useful to work in the current traceback module layout - that is to alter extract_stack to (optionally) include rendered data about locals and then look for that and format it in format_list.
For 17911, we're not so sure about that - there's a strong case to be made for exposing a new object-oriented API, rather than continuing with the C-style "records + functions that work on them" model that the traceback module currently uses.
There's a nice functional feel there more than a C thing, IMO - in that there is no hidden state, and everything is laid out for direct view.
OTOH I've no objection to a more objects-and-methods feel, though we'll want a thunk through to the new code which will essentially just be constructing objects just-in-time. Seems straightforward enough to write though.
Fourth, expand the new ExceptionSummary API proposed in 17911 to also include the ability to *optionally* preserve the local variable data from the stack frame, rather than always discarding it.
A couple of other related things I should be clear about: I want something I can backport successfully via traceback2, for use in unittest2. I don't see any issue with the proposal so far, other than the linecache API change needed to support __loader__, which implies a linecache2 backport as well.
The last option is my preferred one because the other two both imply having a data structure which is likely to break existing code - and while you'd have to opt into having them it seems likely to require a systematic set of upgrades vs having an optional attribute that can be looked up.
So - thoughts?
I don't see a lot of value in adhering too strictly to the current API design model - I think it would make more sense to design a new higher level API, and then look at including some elements to make it easy to adopt the new tools in existing code without having to rewrite the world (e.g. the inspect module work in Python 3.4 that switched the legacy APIs to actually call the new inspect.signature API internally, greatly expanding the scope of the types they could handle).
Sure. AIUI no one is actively pushing on the new thing, so I'll put my hand up for it now and we'll see where I get to in my available cycles.
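For readers arriving later, here is a sketch of how this looks with the object-oriented traceback API that is available in Python 3.5 and later (traceback.TracebackException and its capture_locals option). It is an illustration of the idea discussed in this thread, not the design as it stood at the time of writing; the divide function is a made-up example.

```python
import traceback


def divide(numerator, denominator):
    # Made-up example function that fails for a zero denominator.
    return numerator / denominator


try:
    divide(1, 0)
except ZeroDivisionError as exc:
    # Opt in to capturing locals; without capture_locals they are not kept.
    summary = traceback.TracebackException.from_exception(exc, capture_locals=True)

last_frame = summary.stack[-1]          # the frame inside divide()
captured = last_frame.locals            # dict of name -> repr() string
rendered = "".join(summary.format())    # formatted traceback including locals
print(rendered)
```

Because the locals are stored as rendered strings rather than live objects, the summary does not keep the frames alive, which matches the "capture cheaply, render later" goal mentioned for issue 17911.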
-Rob
participants (4): Andrew Barnert, Nick Coghlan, random832@fastmail.us, Robert Collins