[Python-ideas] Delay evaluation of annotations

אלעזר elazarg at gmail.com
Sat Sep 24 21:55:09 EDT 2016


I promised not to bother you, but I really can't. So here's what I felt I
have to say. This email is quite long. Please do not feel obliged to read
it. You might find some things you'll want to bash at the end though :)

Short-ish version:

1. Please consider disallowing the use of side effects of any kind in
annotations, stating that it is not guaranteed when they will be evaluated,
if at all. That way, a change 3 years from now will be somewhat less likely
to break things. Please consider doing this for version 3.6; it is
feature-frozen, but this is not (yet) a feature, and I got the feeling it is
hardly controversial.
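To make the concern concrete, here is a minimal runnable demonstration of
today's behavior: the annotation expression runs, side effect and all, the
moment the `def` statement executes, before the function is ever called.

```python
# Today (Python 3.x), annotation expressions are evaluated at definition
# time, so any side effect fires when `def` executes, not when foo is called.
log = []

def foo(x: log.append("annotation evaluated")) -> None:
    pass

# foo() was never called, yet the side effect has already happened:
print(log)  # ['annotation evaluated']
```

Deprecating such side effects would merely document that code like this may
stop "working" in a future version.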

I really have no interest in wasting the time of anybody here. If this
request is not something you would ever consider, please ignore the rest of
this email.

2. A refined proposal for future versions of the language: the ASTs of the
annotation-expressions will be bound to __raw_annotations__.
   * This is actually more in line with what PEP 3107 was about ("no assigned
semantics"); except for a single sentence, it is only about expressions, not
objects.
   * This is helpful even if the expression is evaluated at definition
time, and can help in smoothing the transition.

3. The main benefit of my proposal is that contracts (examples,
explanations, assertions, and types) are naturally expressible as (almost)
arbitrary Python expressions, but not if they are evaluated, or must be
evaluatable, at definition time by the interpreter. Why: because they are
really written in a different language - *always*. This is the real reason
behind the existence of the forward reference problem, and behind its
current solutions. In general this is much more flexible than the current
situation.

4. For compatibility, a new raw_annotations() function will be added, and a
new annotations() function will be used to get the eval()ed version of
them. Similarly to dir(), locals() and globals().
  * Accessing __annotations__ should work like calling annotations(), but be
frowned upon, as it might disappear in the far future.
  * Of course other `inspect` functions should give the same results as
today.
  * Calling annotations()['a'] is like eval(raw_annotations()['a']), which
resembles eval(raw_input()).

I believe the last point is well motivated, as explained later: it is an
interpretation of a different language, one foreign to the interpreter,
although sometimes close enough to be useful. It is of course well formed,
so the considerations are not really security-related.
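The proposed raw_annotations()/annotations() pair does not exist today, but
the closest current approximation, string annotations, shows the intended
split: the "raw" form is stored untouched, and evaluation happens only on
demand. A sketch under that assumption:

```python
# raw_annotations()/annotations() are hypothetical; string annotations are
# today's nearest equivalent. The string plays the role of the raw form.
def divide(x: "int", y: "int and y != 0") -> "float":
    return x / y

raw = divide.__annotations__          # the unevaluated, "raw" forms
resolved = eval(raw["x"], globals())  # on-demand, like annotations()['x']
print(resolved)  # <class 'int'>

# Note that raw["y"] cannot (and need not) be eval()ed without y bound -
# which is exactly the point: only a consumer that binds y should try.
```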

I am willing to do any hard work that will make this proposal happen
(evaluating existing libraries, implementing changes to CPython, etc) given
a reasonable chance for acceptance.

Thank you,
Elazar

---

Long version:

Stephen - I read your last email only after writing this one; I think I
have partially addressed the lookup issue (with ASTs and scopes), and
partially agree: if there's a problem implementing this feature, I should
look deeper into it. But I want to know that it _might_ be considered
seriously, _if_ it is implementable. I also think that Nick refuted the
claim that evaluation time and lookup *today* are so simple to explain. I
know I have a hard time explaining them to people.

Nick, I have read your blog post about the high bar required for breaking
compatibility, and I have followed this mailing list for a while. So I agree
with the reasoning (from my very, very little experience); I only want to
understand where this break of compatibility happens, because I can't see
it.

Chris:

On Fri, Sep 23, 2016 at 6:59 PM Chris Angelico <rosuav at gmail.com> wrote:

> On Fri, Sep 23, 2016 at 11:58 PM, אלעזר <elazarg at gmail.com> wrote:

> > "Unknown evaluation time" is scary. _for expressions_, which might have
> > side effects (one of which is running time). But annotations must be
> > pure by convention (and tools are welcome to warn about it). I admit
> > that I propose breaking the following code:
> >
> > def foo(x: print("defining foo!")): pass
> >
> > Do you know anyone who would dream about writing such code?
>
> Yes, side effects make evaluation time scary. But so do rebindings,
> and any other influences on expression evaluation. Good, readable code
> generally follows the rule that the first instance of a name is its
> definition.


No, it isn't. I would guess that even the code you write or consider
excellent and readable still contains functions that use entities defined
only later in the code. It is only when you follow the execution path that
you should already be familiar with the names.
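This asymmetry is easy to demonstrate: a function body may freely reference
a name defined later in the module, because lookup happens at call time, yet
the very same forward reference in an annotation fails at definition time.

```python
# Function bodies tolerate forward references: names are looked up when the
# function is called, not when it is defined.
def greet():
    return helper()  # 'helper' does not exist yet - no error here

def helper():
    return "hi"

print(greet())  # "hi" - by call time, helper exists

# But an annotation is evaluated when `def` executes, so the same forward
# reference raises immediately:
try:
    def f(x: not_defined_yet): pass
except NameError as e:
    print("NameError at definition time:", e)
```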

I think rebinding is only scary when it is combined with side effects or
when the name lookup is not clear. And why do you call it _re_binding?

<snip>
> >> > > class MyClass:
> >> > >     pass
> >> > >
> >> > > def function(arg: MyCalss):
> >> > >     ...
> >> > >
> >> > > I want to see an immediate NameError here, thank you very much
> >> >
> >> > Two things to note here:
> >> > A. IDEs will point at this NameError
> >>
> >> Some or them might. Not everyone uses an IDE, it is not a requirement
> >> for Python programmers. Runtime exceptions are still, and always will
> >> be, the primary way of detecting such errors.
> >
> > How useful is the detection of this error at production?
>
> The sooner you catch an error, the better. Always.
>
>
No. No. No. If code in production fails at my client's site because of a
misspelled annotation (unused by runtime tools), I will be mad. *At the
language*. It is just as reasonable as failing because of misspelled
documentation. (My suggestion does not prevent it completely, of course.
Nothing will. I only say this is unhelpful.)

<snip>

> It's worth reiterating, too, that function annotations have had the
> exact same semantics since Python 3.0, in 2008.


When were these semantics decided, and for what purposes, if I may ask?
Because the PEP (2006) explicitly states that "this PEP makes no attempt to
introduce any kind of standard semantics". The main relevant paragraph
reads (I quote the PEP, my own emphasis):

"2. Function annotations are nothing more than a way of associating
arbitrary Python EXPRESSIONS with various parts of a function at
compile-time.
By itself, Python does not attach ANY PARTICULAR MEANING or significance to
annotations. Left to its own, Python simply makes these EXPRESSIONS available
as described in Accessing Function Annotations below. The only way that
annotations take on meaning is when they are interpreted by third-party
libraries. These annotation consumers can do anything they want with a
function's annotations."

Amen to that! Word by word as my suggestion. Why aren't these _expressions_
available to me, as promised? <baby crying>

Sadly, a few paragraphs later, the PEP adds that "All annotation
expressions are evaluated when the function definition is executed, just
like default values." - Now please explain to me how that is attaching "no
particular meaning or significance to annotations". You practically
evaluate them, for heaven's sake! This is plain and simple "attached
meaning" and "standard semantics". Unusefully so. I put an expression
there, and all I got was a lousy object.


> Changing that now
> would potentially break up to eight years' worth of code, not all of
> which follows PEP 484. When Steve mentioned 'not breaking other uses
> of annotations', he's including this large body of code that might
> well not even be visible to us, much less under python.org control.
> Changing how annotations get evaluated is a *major, breaking change*,
> so all you can really do is make a style guide recommendation that
> "annotations should be able to be understood with minimal external
> information" or something.
>

As I said, it is a strong argument - given an example of such a potential
break for non-convoluted code. I want to see such an example.

But why is "deprecating side effects in annotations'
definition-time execution" considered a breaking change? It is just
documentation. Everything will work as it always has. Even edge cases. I
would think this is possible even for the feature-frozen 3.6. Like saying
"We've found a loophole in the language; it might get fixed in the future.
Don't count on it."

> Here's my counter-proposal.
<snip>
> Mutual2 = "Mutual2" # Pre-declare Mutual2
> class Mutual1:
>     def spam() -> Mutual2: pass
> class Mutual2:
>     def spam() -> Mutual1: pass
>
> Problem solved, no magic needed.

Problem not solved. Your counter-proposal solves only certain forward
references, and requires keeping one more thing in sync - in particular,
adapting the scope of the "forward declaration" to the scope of the later
definition, which may change over time and is a violation of DRY. Oh, and
lastly, type checkers will scream, or will have to work very hard to allow
this idiom.
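For comparison, the workaround PEP 484 actually standardized is to quote the
forward reference as a string and let typing.get_type_hints() evaluate it
later, once both classes exist; this is runnable today:

```python
# PEP 484 string forward references: the annotation is stored as a string
# at definition time and only resolved on demand, after both classes exist.
import typing

class Mutual1:
    def spam(self) -> "Mutual2": ...

class Mutual2:
    def spam(self) -> "Mutual1": ...

hints = typing.get_type_hints(Mutual1.spam)
print(hints["return"] is Mutual2)  # True - resolved after the fact
```

Note that this already concedes the point: the "real" annotation here is an
unevaluated expression, smuggled past the interpreter inside a string.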

My proposal asks for no magic at all. Unless you consider dir() and
locals() magical (they are. a bit).


Stephen:

Regarding the terminology: I said "But I think we are getting lost in the
terminology", including myself.

On Sat, Sep 24, 2016 at 10:07 PM Stephen J. Turnbull <
turnbull.stephen.fw at u.tsukuba.ac.jp> wrote:
> Python has a very simple model of expressions. The compiler turns them
> into code. The interpreter executes that code, except in the case where
> it is "quoted" by the "def" or "lambda" keywords, in which case it's
> stored in an object (and in the case of "def", registered in a namespace).

Here's the version for annotations: The compiler turns them into an AST.
The interpreter does nothing with them, except attaching them to the
__annotations__ object. Are you the average programmer? Then just read
them; they should be helpful.

How simple is that?
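The `ast` module shows what that model looks like in practice: parsing an
annotation succeeds even when none of the names in it exist anywhere,
precisely because no evaluation takes place.

```python
# Compiling to an AST requires no name resolution: this annotation would
# raise NameError if evaluated, yet it parses into structure just fine.
import ast

source = "def call_foo_twice(x: hasattr(x, 'foo') and callable(x.foo)) -> None: ..."
tree = ast.parse(source)
annotation = tree.body[0].args.args[0].annotation

print(type(annotation).__name__)  # BoolOp - the structure, not a value
```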

> Another design principle is Occam's Razor, here applied as "new kinds of
> thing shall not spring up like whiskers on Barry's chin." Yes, function
> annotations need new syntax and so are a new kind of thing to that extent.
> *Their values don't need to be,*

Their values don't need to be there at all. All that is needed is their
structure - the AST. Their value is of very little use, actually. And I'm
familiar with the typing module; it is bent here and there to accommodate
this need to have a "value", an object, when you actually need only the
structure. I don't invent a "new thing" any more than is already there.

I have a strong belief that a new kind of thing is _already there_. In
Python it is called "types", "annotations" and some forms of
"documentation". In other languages it is called by other names. I don't
mind the merging of this concept with the concept of Expression - I
actually think it is brilliant. Sadly, the _interpreter_ does not
understand it. This is (brilliantly again) admitted in the PEP, but the
interpreter does not seem to get it, and tries to execute it anyway. So
people shush it with a quote. Well done. Now where is that Expression
thing? And why can't my editor highlight it?

>    (2) it's a clear pessimization in the many cases where those
>        values are immutable or very rarely mutated, and the use case
>        (occasional) of keeping state in mutable values.  The thunk
>        approach is more complex, for rather small benefit.  Re "small
>        benefit", IMHO YMMV, but at least with initialization Guido is
>        on record saying it's the RightThang[tm] (despite a propensity
>        of new users to write buggy initializations).

Is keeping an AST, with no evaluation at all, still a clear pessimization?

> Chris argues that "compile to thunk" is incoherent, that
>    expressions in function bodies are no different than anywhere else
>    -- they're evaluated when flow of control reaches them.

I argue that flow of control should not reach annotations at all. Not the
control of the interpreter. It does not understand what's written there,
and should not learn.

>  (5)  <snip>
>     A variable not defined because it's on the path not taken, or even
>    a function: they just don't exist as far as the interpreter is
>    concerned -- there's no way to find them from Python

See my previous comment. The interpreter should not look for them. You know
what? It can try to give a hint in name resolution. As a favor. If the
tools can't find it, it's their business to report.

> "[G]ood general theory does not search for the maximum generality, but
> for the right generality."

I believe this is the right generality "because maths" and my own
intuition. This should not convince you at all.

> Your proposal of "evaluate to thunk" (possibly
>    incorporating the property-based magic Alexander proposed) might
>    be right *too*, but it's far from obviously better to me

As you've noticed, I refined my proposal to "don't evaluate".

And here's my attempt at presenting the "because maths" argument you
probably don't want to hear: it will allow a natural and well-founded way
to express contracts and dependent types, which is a much more natural and
flexible way to type dynamically-typed languages such as Python. It is
nothing new, really; it is based on a 40-year-old understanding that types
are propositions *and propositions are types*. And I want to use it. From
simple to complex:

@typecheck
def to_float(x: int or str) -> float:
    ...

(Oh, @typecheck can't access this, so they invented Union. Oh well. TODO:
explain to a junior what Union[int, str] is.)
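For contrast, here is today's spelling of that first contract with the
typing module, which works precisely because Union[int, str] is an object
the interpreter can evaluate at definition time:

```python
# Today's evaluatable spelling of "int or str": Union makes the contract an
# object rather than an expression, so definition-time evaluation succeeds.
from typing import Union

def to_float(x: Union[int, str]) -> float:
    return float(x)

print(to_float("2.5"))  # 2.5
```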

@typecheck
def __add__(self, x: int and float) -> float:
    ...

This should help resolve a real problem in type checkers regarding
overloads and overlapping types. Except @typecheck can only see the object
"float", and they have not invented Intersection[] yet. A bummer, but
fixable.

@dependent
def call_foo_twice(x: x.foo()) -> None:
    x.foo()
    x.foo()

Uh, I don't know. Perhaps x.foo() has side effects, and I'm not sure how
"dependent" works, and perhaps it is not so clear. Try again:

@dependent
def call_foo_twice(x: hasattr(x, "foo") and is_callable(x.foo)) -> None:
    x.foo()
    x.foo()

But why NameError? :(
Let's define contracts:

@contract
def divmod(x: int, y: int and y != 0) -> (x//y, x % y):
    return # an optimized version

NameError again? :(

Not every function _should_ be annotated this way, but why can't you
_allow_ this kind of annotation? These raise a needless NameError for an
obviously meaningful expression.
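The escape hatch available today is, again, to quote the whole contract as a
string. Definition then succeeds with no NameError, and a hypothetical
@dependent tool (not a real decorator - named here for illustration) could
eval() the string with x bound:

```python
# The contract survives as a raw string; a checker - not the interpreter -
# gives it meaning by evaluating it with x bound. @dependent is hypothetical.
def call_foo_twice(x: 'hasattr(x, "foo") and callable(x.foo)') -> None:
    x.foo()
    x.foo()

class HasFoo:
    def foo(self):
        return "ok"

contract = call_foo_twice.__annotations__["x"]
holds = eval(contract, {}, {"x": HasFoo()})
print(holds)  # True - the external tool, not the interpreter, understood it
```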

What if I want to specify "a class that subclasses Abstract but can be
instantiated"? I need it because otherwise mypy resorts to allowing unsafe
code:

def create(cls: typing.Type[Abstract] and cls(...) ) -> Base:
    return cls()

NameError again. Why? Not because _you_ (people) don't understand it. No.
It is because the _interpreter_ claims to understand it, but it doesn't. It
cannot, because Python is not intended to be a specification language, and
probably should not be. Even with simple types it doesn't, really. It just
happens to look up the right _class_, which is not a type but is useful
nonetheless; when I write "x: int" I actually mean "x: isinstance(x, int)",
but the interpreter doesn't get it, and should not get it. And what if I
want "x: type(x) == int"? It has its uses.
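The distinction is concrete: because bool is a subclass of int, the two
readings of "x: int" disagree on True.

```python
# bool subclasses int, so the isinstance reading and the exact-type reading
# of "x: int" give different answers for True.
x = True
print(isinstance(x, int))  # True  - the "x: isinstance(x, int)" reading
print(type(x) == int)      # False - the "x: type(x) == int" reading
```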

Regarding my "not an expression" claim: this is a proposition, an
assumption, a precondition, or an explanation - different incarnations of
the same idea - as in any annotation system I have seen (except possibly
injection, which will not break). Including Nick's "begin". Now notice that
nice command-line parser - the annotations there can still be strings,
combined easily with type checkers. Why? Because it targets human users,
that's why. And the interpreter does not try to understand them, because it
does not understand English. Well, it does not understand Type-ish or
Contract-ish, or any other Spec-ish either. There are external tools for
that, thank you very much. Just give them those things that you (rightly)
consider to be expressions. I like them raw.

Elazar
