On Thu, Jan 28, 2021 at 8:08 PM Terry Reedy <tjreedy@udel.edu> wrote:
On 1/27/2021 7:01 PM, Guido van Rossum wrote:
> It's likely that the additions are going to break someone's code;

Since at least 2007, when Georg moved the 3i reST tree into the 3k
branch, the instances have been described as "Class/Function Descriptor
Objects", returned by readmodule(_ex), that "provide the following data
members".  There are no methods (maybe __repr__ should be added for
debugging), so these are pure data classes, identical in purpose to pure
dataclasses.

Lots of people just read the source code though. In general we've often been careful with changing undocumented APIs if we believe they are commonly used in ways not endorsed by the official documentation.
 
The only intended reason to call these would be to create a tree for
testing. 

Really? Couldn't someone have an application where they insert new objects (like is documented for the ast module)?
 
We do this indirectly with a couple of private test-only
pyclbr functions.  Being private, *those* functions can have the new
parameter added anywhere, regardless of the Class/Function signatures.

test_pyclbr uses a constructed tree as the expected output from
readline_ex(test_code).  IDLE's test_browser uses a construction as test
input to the browser widget. If someone does similar testing, it could
break with the proposed changes.

So the question is whether we care. I don't know how to assess that.

> the
> question seems to be do we care, since the constructors are undocumented?

If any library functions return dataclasses, are the signatures of the
(possibly generated) __init__ functions documented?

I'd say so, because we document how `@dataclass` constructs the `__init__()` method.
 
I care more about breaking tests than possibly non-existent 'off-label'
uses, but Python won't know.

> But shouldn't we just document the signatures,

The existing or proposed signatures?

I meant the proposed signatures.

> so that in the future
> we're not going to make the same mistake?

I am not sure what you mean by the 'mistake' to not be repeated.  I
consider it be had been either not declaring the signature private or at
least not making optional parameters keyword only.

Yeah, by 'mistake' I meant leaving it ambiguous whether the constructor signature is part of the API. And yes, we could also have designed the constructor signatures more carefully.
 
> I also wonder if some judicious use of keyword-only args might help
> users who currently call the old constructors catch the breakage where
> it's easily diagnosed (i.e. at the call site) instead of once the object
> is instantiated (e.g. "why is end_lineno/parent the wrong type")?

With 'end_lineno' added after existing required parameters, just before
the existing optional parameter 'parent', an existing Class call ending
with '..., posint, parent=SomeClass)' would fail with

TypeError: Class.__init__ missing 1 required positional argument:
'end_lineno'

Unless you give end_lineno a default (I don't think we should do this unless we find significant rogue usages).
 
If parent had previously been made 'keyword only', a call ending instead
with '..., num, SomeClass' would be a failure now.  But since not,

Right. One of the mistakes.

> Perhaps even add some type checks to catch obvious mistakes (could be as
> simple as `assert isinstance(end_lineno, int)").

Do you consider this enough to make the proposed positioning of
'end_lineno' acceptible?  The tradeoff is some immediate annoyance
versus forever annoyance of a misplaced required 'optional' parameter.

Yes, I think that would be sufficient -- we'd break rogue usages (if there are any) but at least they'd break pretty cleanly, in a way that's easy to debug.
 
Since I have in mind a possible IDLE use for end_lineno, quite different
from that of the patch author, I consider either way of adding it
superior to never adding it.

Agreed.

--
--Guido van Rossum (python.org/~guido)