[Python-Dev] PEP 563: Postponed Evaluation of Annotations

Sat Nov 4 21:32:13 EDT 2017

On 5 November 2017 at 02:42, Guido van Rossum <guido at python.org> wrote:
> I'm very worried about trying to come up with a robust implementation of
> this in under 12 weeks. By contrast, the stringification that Łukasz is
> proposing feels eminently doable.

I'm far from confident about that, as the string proposal inherently
breaks runtime type annotation evaluation for nested function and
class definitions: with the compiler no longer involved in their name
resolution, those annotations lose access to nonlocal variable
references.

https://www.python.org/dev/peps/pep-0563/#resolving-type-hints-at-runtime
is essentially defining a completely new, annotation-specific scheme
for name resolution, and takes us back to a Python 1.x era
"locals and globals only" approach with no support for closure
variables.

Consider this example from the PEP:

    def generate():
        A = Optional[int]
        class C:
            field: A = 1
            def method(self, arg: A) -> None: ...
        return C
    X = generate()

The PEP's current attitude towards this is "Yes, it will break, but
that's OK, because it doesn't matter for the type annotation use case,
since static analysers will still understand it". Adopting such a
cavalier approach towards backwards compatibility with behaviour that
has been supported since Python 3.0 *isn't OK*, since it would mean we
were taking the step from "type annotations are the primary use case"
to "Other use cases for function annotations are no longer supported".
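
The breakage in the example above can be reproduced with current tools.
This is a minimal sketch, assuming that an explicit string literal
annotation behaves like a PEP 563 stringified one and that
typing.get_type_hints() is the runtime consumer; the
"resolved_without_help" name is purely illustrative.

```python
from typing import Optional, get_type_hints

def generate():
    A = Optional[int]
    class C:
        # Explicit string literal standing in for PEP 563 stringification
        def method(self, arg: "A") -> None: ...
    return C

X = generate()

try:
    get_type_hints(X.method)
    resolved_without_help = True
except NameError:
    # "A" only ever existed as a local of generate(), so a lookup
    # restricted to globals and locals cannot find it
    resolved_without_help = False

# Supplying the missing name by hand is the only way to recover it
hints = get_type_hints(X.method, localns={"A": Optional[int]})
```

The consumer has to know which names were closure variables and pass
them in explicitly, which is exactly the information the compiler would
otherwise have tracked.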

The only workaround I can see for that breakage is that instead of
using strings, we could instead define a new "thunk" type that
consists of two things:

1. A code object to be run with eval()
2. A dictionary mapping from variable names to closure cells (or None
   for not-yet-resolved references to globals and builtins)

Correctly evaluating the code object in its original context would
then be possible by reading the "cell_contents" attributes of the
cells stored in the mapping and injecting them into the globals
namespace used to run the code.
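
A rough sketch of what such a thunk could look like, written in pure
Python. The "Thunk" class, its "evaluate" method and the throwaway
"probe" lambda used to obtain a real closure cell are all hypothetical
illustrations, not an existing or proposed API; a real implementation
would have the compiler emit the code object and the cell mapping.

```python
from typing import List, Optional

class Thunk:
    """Hypothetical thunk: a code object plus a name -> cell mapping."""

    def __init__(self, code, cells):
        self.code = code    # code object compiled in "eval" mode
        self.cells = cells  # name -> closure cell, or None for globals/builtins

    def evaluate(self, globalns):
        # Work on a copy so the caller's namespace is not mutated
        namespace = dict(globalns)
        for name, cell in self.cells.items():
            if cell is not None:
                # Inject the closure variable's current value
                namespace[name] = cell.cell_contents
        return eval(self.code, namespace)

def generate():
    A = Optional[int]
    # Borrow the real closure cell for A from a throwaway lambda
    probe = lambda: A
    cells = dict(zip(probe.__code__.co_freevars, probe.__closure__))
    cells["List"] = None  # left for the global namespace to resolve
    return Thunk(compile("List[A]", "<annotation>", "eval"), cells)

thunk = generate()
result = thunk.evaluate({"List": List})
```

Because the cell is only read inside evaluate(), the thunk sees whatever
value the closure variable holds at evaluation time, and a consumer that
wants to ignore the cells can simply eval() the code object with its own
namespace.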

This would actually be a pretty cool new primitive to have available
(since it also leaves the consuming code free to *ignore* the closure
cells, which is what you'd want for use cases like callback functions
with implicitly named parameters), and retains the current eager
compilation behaviour (so we'd be storing compiled code objects as
constants instead of strings).

If PEP 563 were updated to handle closure references properly using a
scheme like the one above, I'd be far more supportive of the proposal.

Alternatively, consider a lambda based proposal that compiled code like
the above as equivalent to the following code today:

    def generate():
        A = Optional[int]
        class C:
            field: A = 1
            def method(self, arg: (lambda: A)) -> None: ...
        return C
    X = generate()

Then everything's automatically fine, since the compiler would
correctly resolve the nonlocal reference to A and inject the
appropriate closure references.
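
This can be checked by writing the lambda out by hand. A minimal sketch,
assuming the consumer retrieves the annotation through __annotations__
and calls it; the "arg_thunk" and "resolved" names are illustrative only.

```python
from typing import Optional

def generate():
    A = Optional[int]
    class C:
        # Explicit lambda standing in for the implicitly generated one
        def method(self, arg: (lambda: A)) -> None: ...
    return C

X = generate()

# The stored annotation is a function whose closure refers to A
arg_thunk = X.method.__annotations__["arg"]
resolved = arg_thunk()
```

No namespace has to be supplied by the caller, since the compiler has
already wired the reference to A into the lambda's closure.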

In such a lambda based implementation, the *only* tricky case is this
one, where the type alias is declared at class scope:

        class C:
            A = Optional[int]
            field: A = 1
            def method(self, arg: A) -> None: ...
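
The underlying scoping rule is that functions defined in a class body
cannot see names bound in that class body. A minimal sketch, assuming an
ordinary lambda stored as a class attribute is a fair stand-in for an
implicitly generated annotation lambda; "arg_thunk" and
"class_scope_visible" are illustrative names.

```python
from typing import Optional

class C:
    A = Optional[int]
    # Explicit lambda standing in for an implicit annotation lambda;
    # A is compiled as a global lookup, not a class attribute lookup
    arg_thunk = lambda: A

try:
    C.arg_thunk()
    class_scope_visible = True
except NameError:
    # The class namespace is skipped when resolving names in the lambda
    class_scope_visible = False
```
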

Now, even without the introduction of the IndirectAttributeCell
concept, this is amenable to a pretty simple workaround:

        A = Optional[int]
        class C:
            field: A = 1
            def method(self, arg: A) -> None: ...
        C.A = A
        del A

But I genuinely can't see how breaking annotation evaluation at class
scope can be treated as a deal-breaker for the implicit lambda based
approach unless breaking annotation evaluation for nested functions is
also treated as a deal-breaker for the string based approach.

Either way, there are going to be changes needed to the compiler in
order for it to still generate suitable references at compile time -
the only question would then be whether they're existing cells stored
in a new construct (a thunk to be executed with eval rather than via a
regular function call), or a new kind of cell stored on a regular
function object (implicit access to class attributes from implicitly
defined scopes in the class body).

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia