On Fri, Jun 26, 2020 at 6:42 AM Mark Shannon <mark@hotpy.org> wrote:

> Let us start from some anecdotal evidence: isinstance() is one of the most called functions in large scale Python code-bases (by static call count). In particular, when analyzing some multi-million line production code base, it was discovered that isinstance() is the second most called builtin function (after len()). Even taking into account builtin classes, it is still in the top ten. Most of such calls are followed by specific attribute access.

Why use anecdotal evidence? I don't doubt the numbers, but it would be
better to use the standard library, or the top N most popular packages
from GitHub.

Agreed. This anecdote felt off to me and made for a bad introductory feeling. I know enough of who is involved to read it as likely "within the internal Dropbox code base we found isinstance() to be the second most called built-in by static call counts". It'd be better worded as such instead of left opaque if you are going to use this example at all. [but read on below, i'm not sure the anecdotal evidence is even relevant to state]

Also if using this, please include text explaining what "static call count means". Was that "number of grep 'isinstance[(]' matches in all .py files which we reasonably assume are calls"? Or was that "measuring a running application and counting cumulative calls of every built-in for the lifetime of the large application"? Include a footnote of if you have you removed all use of six and py2->py3-isms? Both six and manual py2->3 porting often wound up adding isinstance in places where they'll rightfully be refactored out when cleaning up the py2 dead code legacy becomes anyones priority.

A very rough grep of our much larger Python codebase within Google shows isinstance call site counts to likely be lower than int or len and similar to print. With a notable percentage of isinstance usage clearly related to py2 -> py3 compatibility, suggesting many can now go away. I'm not going to spend much time looking further as I don't think actual numbers matter: Confirmed, isinstance gets used a lot. We can simply state that as a truth and move on without needing a lot of justification.

>
> There are two possible conclusions that can be drawn from this information:
>
> Handling of heterogeneous data (i.e. situations where a variable can take values of multiple types) is common in real world code.
> Python doesn't have expressive ways of destructuring object data (i.e. separating the content of an object into multiple variables).

I don't see how the second conclusion can be drawn.
How does the prevalence of `isinstance()` suggest that Python doesn't
have expressive ways of destructuring object data?

...

>
> We believe this will improve both readability and reliability of relevant code. To illustrate the readability improvement, let us consider an actual example from the Python standard library:
>
> def is_tuple(node):
> if isinstance(node, Node) and node.children == [LParen(), RParen()]:
> return True
> return (isinstance(node, Node)
> and len(node.children) == 3
> and isinstance(node.children[0], Leaf)
> and isinstance(node.children[1], Node)
> and isinstance(node.children[2], Leaf)
> and node.children[0].value == "("
> and node.children[2].value == ")")
>

Just one example?
The PEP needs to show that this sort of pattern is widespread.

Agreed. I don't find application code following this pattern to be common. Yes it exists, but I would not expect to encounter it frequently if I were doing random people's Python code reviews.

The supplied "stdlib" code example is lib2to3.fixer_util.is_tuple. Using that as an example of code "in the standard library" is technically correct. But lib2to3 is an undocumented deprecated library that we have slated for removal. That makes it a bit weak to cite.

Better practical examples don't have to be within the stdlib.

Randomly perusing some projects I know that I expect to have such constructs, here's a possible example: https://github.com/PyCQA/pylint/blob/master/pylint/checkers/logging.py#L231.

There are also code patterns in pytype such as https://github.com/google/pytype/blob/master/pytype/vm.py#L480 and https://github.com/google/pytype/blob/master/pytype/vm.py#L1088 that might make sense.

Though I realize you were probably in search of a simple one for the PEP in order to write a before and after example.

-gps