[Python-Dev] Backward incompatible change about docstring AST

INADA Naoki songofacandy at gmail.com
Tue Feb 27 08:37:24 EST 2018

Hi, all.

There is a design discussion that is a deferred blocker for 3.7.

## Background

A year ago, I moved the docstring in the AST from the statement list to a
field of the module, class, and function nodes.

Without this change, AST-level constant folding was complicated, because
"foo" can be a docstring but "fo" + "o" can't.
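
To illustrate the distinction, here is a minimal sketch using only the stdlib `ast` module. It assumes a Python version where `ast.get_docstring()` inspects the first statement of the body; the sample sources are made up for illustration.

```python
import ast

# A bare string literal as the first statement is a docstring.
plain = ast.parse('def f():\n    "foo"\n').body[0]

# An expression that merely evaluates to the same string is not,
# even though constant folding would turn it into "foo".
folded = ast.parse('def f():\n    "fo" + "o"\n').body[0]

plain_doc = ast.get_docstring(plain)
folded_doc = ast.get_docstring(folded)

print(plain_doc)
print(folded_doc)
```

If folding ran on the statement list before docstring detection, the second function would wrongly gain a docstring, which is the complication described above.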

It also simplified some other edge cases.  For example, a future import must
be at the top of the module, but a docstring can come before it.
A docstring is very different from other expressions and statements.
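
The future-import edge case can be checked with the built-in `compile()`. This is only a sketch: the helper name `compiles` and the sample sources are made up for illustration.

```python
def compiles(source):
    """Return True if `source` compiles as a module, False on SyntaxError."""
    try:
        compile(source, "<sketch>", "exec")
    except SyntaxError:
        return False
    return True

# A docstring is allowed before a future import...
doc_first = '"""Doc."""\nfrom __future__ import division\n'

# ...but an ordinary statement is not.
stmt_first = 'x = 1\nfrom __future__ import division\n'

doc_first_ok = compiles(doc_first)
stmt_first_ok = compiles(stmt_first)

print(doc_first_ok, stmt_first_ok)
```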

Of course, this change was backward incompatible.
Tools that read or write docstrings via the AST are broken by this change.
For example, it broke PyFlakes, which has already been fixed.


Since the AST doesn't guarantee backward compatibility, we can change
it when there is a good reason.

Last week, Mark Shannon reported an issue about this backward incompatibility.
As he said, this change dropped the lineno and column of the docstring from the AST.
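
For reference, a minimal sketch of how a tool can recover the docstring position when the docstring is an ordinary statement in `body` (the 3.6-style layout). It assumes Python 3.8 or later, where string literals parse to `ast.Constant`; the helper name `docstring_location` is made up for illustration.

```python
import ast

def docstring_location(node):
    """Return (lineno, col_offset) of a leading docstring statement, or None."""
    body = getattr(node, "body", None)
    if not body:
        return None
    first = body[0]
    if (isinstance(first, ast.Expr)
            and isinstance(first.value, ast.Constant)
            and isinstance(first.value.value, str)):
        return first.lineno, first.col_offset
    return None

source = '# leading comment\n"Module docstring."\nx = 1\n'
location = docstring_location(ast.parse(source))
print(location)
```

When the docstring is stored as a plain string field instead of a node, there is no statement left to carry these attributes, so a helper like this has nothing to report.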


## Design discussion

As he also said, there are three options:


> It seems to be that there are three reasonable choices:
> 1. Revert to 3.6 behaviour, with the addition of `docstring` attribute.
> 2. Change the docstring attribute to an AST node, possibly by modifying the grammar.
> 3. Do nothing.

Option 1 is backward compatible for reading docstrings.
But for writing, it's not DRY or SSOT (single source of truth): there are
two sources for the docstring.
For example: `ast.Module([ast.Str("spam")], docstring="egg")`
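
The two-sources problem can be sketched with a toy class. `ToyModule` and `docstring_sources` are hypothetical stand-ins made up for illustration; they are not the real `ast.Module` API.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class ToyModule:
    """Hypothetical stand-in for a Module node under option 1."""
    # The leading string in `body` plays the role of the 3.6-style docstring statement.
    body: List[str] = field(default_factory=list)
    # The extra attribute that option 1 would add.
    docstring: Optional[str] = None

def docstring_sources(node):
    """Return both candidate docstrings: (from body, from attribute)."""
    from_body = node.body[0] if node.body else None
    return from_body, node.docstring

node = ToyModule(body=["spam"], docstring="egg")
from_body, from_attr = docstring_sources(node)
print(from_body, from_attr)
```

Nothing forces the two values to agree, so any code that writes such a node has to decide which one wins.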

Option 2 is worth considering.  I tried implementing this idea by adding a
`DocString` statement node to the AST.

While it looks like a large change, most of it reverts the earlier AST changes,
so the result is closer to the 3.6 codebase.  (In particular, test_ast is very
close to 3.6.)

In this PR, `ast.Module([ast.Str("spam")])` doesn't have a docstring, for
simplicity.  So it's backward incompatible for both reading and
writing docstrings.
But it keeps the lineno and column of the docstring in the AST.

Option 3 is the most conservative, because 3.7b2 has already been cut and
some tools already support 3.7.

I prefer 2 or 3.  If we take 3, I don't want to do 2 in 3.8.  One
backward incompatible change is better than two.

Any thoughts?

INADA Naoki  <songofacandy at gmail.com>
