[docs] [issue34648] Confirm the types of parameters of traceback.format_list and traceback.StackSummary.from_list post-3.5

24 Sep 2018


      Nathaniel Manista  added the comment:
...
The question that needs to be answered here is "why we should support
Iterable[FrameSummary]?" and you're one the one who needs to answer
it
Okay, here are a few reasons:
1) A function that requires that an input be a List invites the question “why is an Iterable, or a Collection, or a Sequence, not sufficient?”. Of course the best way to handle that question is to widen the API to accept at least a Sequence so that the question isn’t even asked, but the next-best way to handle the question is with an answer like “because the function mutates the given List”. But that’s not the answer to the question here (which is a big part of why I opened the bug, asking the question more directly). So far the answer seems to be “because we wanted to be conservative in our API design” and “because we didn’t want to support what we weren’t testing”. Those are good answers, but I’d like to persuade you that they are no longer good enough.
2) Because List is invariant in element type whereas Sequence, Collection, and Iterable are covariant (I'm nearly certain) in element type, format_list is hard to use with type-checked code (see the discussion in https://github.com/python/typeshed/pull/2436 if you’re not following it already). Should the typeshed-canonicalized type of format_list’s “extracted_list” parameter be “List[FrameSummary]”? If so, then given a subclass MyFrameSummarySubclass of FrameSummary, List[MyFrameSummarySubclass] would *not* be valid to pass to format_list. That seems like a poor developer experience, right? If the type of “extracted_list” were widened to Sequence[FrameSummary], Collection[FrameSummary], or (ideally) Iterable[FrameSummary], then an expression of type Iterable[MyFrameSummarySubclass] would be perfectly fine to pass for “extracted_list”.
3) I tried to pass an Iterable[FrameSummary] to traceback.format_list in my own code. My situation is not itself open source but is a bit like this:

    def _FormatFrames(frame) -> str:
        """Generate a traceback of the current position of the thread in the code."""
        frame_tuples = []
        while frame:
            filename = <censored>
            lineno = frame.f_lineno
            line = <censored>
            frame_tuples.append((filename, lineno, frame.f_code.co_name, line,))
            frame = frame.f_back
        formatted_frame_strings = traceback.format_list(
            traceback.FrameSummary(file_name, line_number, frame_name, line=line)
            for file_name, line_number, frame_name, line in frame_tuples)
        return ''.join(formatted_frame_strings)

.
...
instead of saying our design decisions were "absurd" and "not wise",
without giving any concrete use cases for your "wider" design choice.
According to what we’ve learned over the course of the life and use of Python, and according to what we understand as best practices in the year 2018: I believe it is absolutely unwise to have offer an API that requires that a given parameter be a List without documenting the reasons sustaining that extremely limited design. Why won’t a Sequence suffice? Heck, why won’t some other implementation of MutableSequence suffice? Notice that I opened this bug with the question “wait, really?” rather than the statement “this is wrong and must be changed” - it might be the case that List is the best choice for the type of the parameter (though I now doubt it even more) but certainly it’s clear that *the documentation does not establish or communicate why a List is required*.
...
Initial docstring for format_list() were added in 2000:
https://github.com/python/cpython/commit/e7b146fb3bdc#diff-e57ff53a569d0ebbe...
Given a list of tuples as returned by extract_tb() or
    extract_stack(), return a list of strings ready for printing.
I have no doubt that that’s what was written in 2000, when we didn’t know nearly as much as we do today and when what were considered best practices were radically different. Now we know more - particularly for this case “never ask for a List where asking for a MutableSequence will do, and never ask for a MutableSequence where asking for a Sequence will do, and never ask for a Sequence where asking for a Collection will do, and never ask for a Collection where asking for an Iterable will do”. The big question for format_list is “will an Iterable do?”. If not, that’s strange and unique by today’s standards and practices and the function’s documentation should at least give some enlightenment as to why it is the case.
...
There are no tests for iterables other than list at
https://github.com/python/cpython/blob/master/Lib/test/test_traceback.py
I’d be happy to add some… except test_traceback.py doesn’t appear to contain any direct testing of format_list, and I can’t find “the format_list tests” to which it alludes (https://github.com/python/cpython/search?q=format_list&unscoped_q=format_list). Is there any direct call to traceback.format_list under Lib/test? Where should tests of passing an Iterable to format_list be added?
...
So we won't change documentation and docstrings, add more tests without seeing concrete
use cases.
I wouldn’t be here if I were personally affected in my own use case, and as more and more users start statically verifying their code with mypy and pytype, more and more of them will be affected. This problem is worth solving.
...
I meant an iterable that is extracted from a traceback object manually without
using extract_tb() or extract_stack() functions (by using a custom function, an
external dependency, or HTTP API) How people can get
Iterable[FrameSummary] as an input and pass it to format_list()?
FrameSummary is a public class with a public constructor. Anyone can construct instances of it for any reason and aggregate them in any Iterable they please; the question before us is whether or not format_list should accept any such Iterable or should insist that callers re-aggregate their FrameSummarys in a List. I’ve only described here the only use case that I happen personally to have encountered, but I think it’s worth bearing in mind that *public code elements invite public use*, so we shouldn’t be surprised when FrameSummary and format_list get used for purposes for which they appear well-suited, but which weren’t necessarily envisioned.
...
Are you serious?
Yes.
...
No, we've introduced the new API and spent weeks designing it just
to play with people. They should stick with the old one for no reason.
Thank you for this very strong affirmation, which is more constructive than I suspect you meant it to be. Note that the documentation of format_list doesn’t mention a preference for passing FrameSummarys over passing Tuples, but that may simply be a consequence of not writing the documentation with passing of caller-constructed-values-that-were-not-returned-by-extract_tb-or-extract_stack in mind. Over in https://github.com/python/typeshed/pull/2436 there’s some back-and-forth over what the *element type* of “extracted_list” should be, and I’m in favor of going the extra mile and making it FrameSummary in 3.5-and-later rather than Union[Tuple[str, int, str, Optional[str]], FrameSummary]. It’s reasonable to say that if an author is being diligent enough to type-check their code, we can presume that they are diligent enough to use an API according to current best practices rather than making use of a backwards compatibility carve-out, right? It sounds like you might feel similarly?

----------

_______________________________________
Python tracker 
https://bugs.python.org/issue34648
_______________________________________