[Python-ideas] Delay evaluation of annotations

Chris Angelico rosuav at gmail.com
Sat Sep 24 23:10:45 EDT 2016


On Sun, Sep 25, 2016 at 11:55 AM, אלעזר <elazarg at gmail.com> wrote:
> Short-ish version:
>
> 1. Please consider disallowing the use of side effects of any kind in
> annotations, in that it is not promised when it will happen, if at all, so
> that a change 3 years from now will be somewhat less likely to break things.
> Please consider doing this for version 3.6; it is feature-frozen, but this
> is not (yet) a feature, and I got the feeling it is hardly controversial.
>
> I really have no interest in wasting the time of anybody here. If this
> request is not something you would ever consider, please ignore the rest of
> this email.

I don't think Python has any concept of *disallowing* side effects. As
soon as arbitrary objects can be called and/or subscripted, arbitrary
code can be executed. However, a style guide may *discourage*
extensive side effects, and this I would agree with - not for reasons
of future change, but for reasons of simplicity and readability.

> 3. The main benefit from my proposal is that contracts (examples,
> explanations, assertions, and types) are naturally expressible as (almost)
> arbitrary Python expressions, but not if they are evaluated, or evaluatable,
> at definition time by the interpreter. Why: because they are really written in
> a different language - *always*. This is the real reason behind the
> existence, and the current solutions, of the forward reference problem. In
> general it is much more flexible than the current situation.

So, basically, you want annotations to be able to make use of names
defined in the object they're annotating. That's a reasonable summary
of the idea, if I have this correct. I'll trim out a ton of quoted
material that digs into details.

Ultimately, though, you're asking to change something that has been
this way since *Python 3.0*. You're not asking for a tiny tweak to a
feature that's new in 3.6. If you were, perhaps this could be done,
despite feature freeze; but you're breaking compat with eight years of
Pythons, and that's almost certainly not going to happen.

> Chris:
>
> On Fri, Sep 23, 2016 at 6:59 PM Chris Angelico <rosuav at gmail.com> wrote:
>> Good, readable code
>> generally follows the rule that the first instance of a name is its
>> definition.
>
>
> No, it isn't. I guess that even the code you write or consider to be
> excellent and readable still contains functions that use entities defined
> only later in the code. It is only when you follow the execution path that
> you need to already be familiar with the names.

Actually, no, I do generally stick to this pattern, builtins aside.
Obviously there are times when you can't (mutually exclusive
functions, for instance), but those are pretty rare. Here's an example
program of mine:

https://github.com/Rosuav/LetMeKnow/blob/master/letmeknow.py

There is one star-import, which breaks this pattern (the global name
CLIENT_SECRET comes from keys.py), and which I consider to be a
failing under this principle; but it's better than most of the
alternatives, and like all style recommendations, "define before use"
is a rule that can be broken.

>> The sooner you catch an error, the better. Always.
>>
>
> No. No. No. If code in production fails at my client's site because of
> a misspelled annotation (unused by runtime tools), I will be mad. *At the
> language*. It is just as reasonable as failing because of misspelled
> documentation. (My suggestion does not prevent it completely, of course.
> Nothing will. I only say this is unhelpful.)

Then I strongly disagree. If it's going to fail at the client's site,
I want it to first fail on my computer.

>> It's worth reiterating, too, that function annotations have had the
>> exact same semantics since Python 3.0, in 2008.
>
>
> When were these semantics decided, and for what purposes, if I may ask?
> Because the PEP (2006) explicitly states that "this PEP makes no attempt to
> introduce any kind of standard semantics".

Decorators also have clearly defined local semantics and completely
undefined overall semantics. If you see this in a .py file:

@spaminate(ham=1)
def frobber(): pass

you know exactly what's happening, on a mechanical level: first
spaminate(ham=1) will be called, and then the result will be called
with frobber as an argument, and the result of that bound to the name
frobber. But exactly what the spaminate decorator does is not Python's
business. It might make frobber externally callable (cf. routing
decorators in Flask), or it might modify frobber's behaviour (e.g.
require that ham be 1 before it'll be called), or it might trigger
some sort of run-time optimization (memoization being an easy one,
actual code modifications also being possible).
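
To spell out those mechanics with a toy stand-in for the hypothetical
spaminate factory (not real code from anywhere, just a sketch):

def spaminate(ham):
    def decorate(func):
        func.ham = ham   # stash some configuration on the function
        return func
    return decorate

@spaminate(ham=1)
def frobber(): pass

# The decorator line is mechanically equivalent to:
#     def frobber(): pass
#     frobber = spaminate(ham=1)(frobber)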

Annotations are the same. There's a clearly defined local syntactic handling:

@multicall
def frobber(ham: [10,20,30]): pass

but nothing in the language that says what this truly means. In this
case, I'm envisioning a kind of special default argument handling that
says "if you don't provide a ham argument, call frobber three times
with the successive values from ham's annotation". But you can be
absolutely certain that, on the mechanical level, what happens is that
the expression "[10,20,30]" gets evaluated, and the result gets
stashed into the function's __annotations__.
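
For illustration, a toy multicall along those lines might look like this
(purely hypothetical, and hard-coded to the ham parameter for brevity):

import functools

def multicall(func):
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        # If ham wasn't provided, call func once per value in its annotation.
        if not args and "ham" not in kwargs:
            return [func(value) for value in func.__annotations__["ham"]]
        return func(*args, **kwargs)
    return wrapper

@multicall
def frobber(ham: [10, 20, 30]):
    return ham * 2

print(frobber())               # [20, 40, 60]
print(frobber(5))              # 10
print(frobber.__annotations__) # {'ham': [10, 20, 30]} - already evaluated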

In contrast, function default arguments have *both* forms of semantics
clearly defined. The expression is evaluated and the result stashed
away; and then, when the function is called, if there's no argument,
the default is used.
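
A quick illustration of that definition-time evaluation of defaults:

import time

def stamp(when=time.time()):   # default expression evaluated once, at def time
    return when

first = stamp()
time.sleep(0.01)
second = stamp()
assert first == second         # the stashed result is reused on every call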

> But why is "deprecating side effects in an annotation's
> definition-time execution" considered a breaking change? It is just
> documentation. Everything will work as it always has. Even edge cases. I
> would think this is possible even for the feature-frozen 3.6. Like saying
> "We've found a loophole in the language; it might get fixed in the future.
> Don't count on it."

Deprecating in the sense of "style guides recommend against this" is
fine. PEP 8 has been updated periodically, and it doesn't break
anyone's code (except MAYBE linters, and even then they're not broken,
just not up-to-date). But an actual code change that means that Python
3.7 will reject code that Python 3.5 accepted? That's a breaking
change. And the purpose of your documentation-only deprecation is
exactly that change, whether it lands in 3.7, 3.8, or 3.9; the
timeframe doesn't alter the fact that it will break code.

>> Here's my counter-proposal.
> <snip>
>> Mutual2 = "Mutual2" # Pre-declare Mutual2
>> class Mutual1:
>>     def spam() -> Mutual2: pass
>> class Mutual2:
>>     def spam() -> Mutual1: pass
>>
>> Problem solved, no magic needed.
>
> Problem not solved. Your counter proposal solves only certain forward
> references, and requires keeping one more thing in sync, in particular
> adapting the scope of the "forward declaration" to the scope of the later
> definition, which may change over time and is in violation of DRY. Oh and
> lastly, type checkers will scream or will work very hard to allow this
> idiom.

Type checkers that comply with PEP 484 are already going to support
this notation, because "Mutual2" is a valid annotation. All I've done
differently is make a simple assignment, in the same way that typevars
get assigned.
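
For comparison, the string-literal spelling that PEP 484 already defines
covers the same case without even the pre-declaration:

class Mutual1:
    def spam() -> "Mutual2": pass   # forward reference as a string

class Mutual2:
    def spam() -> Mutual1: pass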

> Keeping an AST without evaluation at all is still a clear pessimization?

The AST for an expression usually takes up more memory than the result
of it, yeah.
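
A rough (and admittedly unscientific) way to see that, counting only the
shallow size of each node object, which understates the AST side:

import ast, sys
from typing import List

tree = ast.parse("List[int]", mode="eval")
ast_bytes = sum(sys.getsizeof(node) for node in ast.walk(tree))
result = List[int]             # what evaluating the annotation would yield
print(ast_bytes, sys.getsizeof(result))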

> And here's my attempt at presenting the "because maths" argument you
> probably don't want to hear: it will allow a natural and well-founded way to
> express contracts and dependent types, which is a much more natural and
> flexible way to type dynamically-typed languages such as Python. It is
> nothing new really; it is based on a 40-year-old understanding that types
> are propositions *and propositions are types*. And I want to use it. From
> simple to complex:
>
> @typecheck
> def to_float(x: int or str) -> float:
>     ...

Please let's not go down this path. Already I have to explain to my
students that this won't work:

if response == "yes" or "y":
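
That parses as (response == "yes") or "y", and since a non-empty string
is always truthy, the condition can never be false:

response = "no"
print(response == "yes" or "y")    # prints 'y', which is truthy, every time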

If it *does* work in annotations but doesn't work everywhere else,
that would be extremely confusing.

> @typecheck
> def __add__(self, x: int and float) -> float:
>     ...
>
> This should help resolve a real problem in type checkers regarding overloads
> and overlapping types. Except @typecheck can only see the object "float".
> And they did not invent Intersection[] yet. Bummer, but fixable.

I'm not sure what the intersection of int and float would be, but
perhaps you mean this more like Java's interfaces - something that
"implements X" and "implements Y" is the intersection of the types X
and Y.

> Let's define contracts
>
> @contract
> def divmod(x: int, y: int and y != 0) -> (x//y, x % y):
>     return # an optimized version
>
> NameError again? :(

Now, this is where stuff starts to get interesting. You want to be
able to define an assertion in terms of the variables you're creating
here. In effect, you have something like this:

def divmod(x, y):
    assert isinstance(x, int)
    assert isinstance(y, int) and y != 0
    ... # optimized calculation
    assert ret == (x // y, x % y)
    return ret

As ideas go, not a bad one. Not really compatible with annotations,
though, and very difficult to adequately parse. If you want to flesh
this out as your proposal, I would suggest setting this thread aside
and starting over, explaining (a) why actual assertions aren't good
enough, and (b) how annotations could be used without breaking
compatibility.

> What if I want to specify "a class that subclasses Abstract but can be
> instantiated"? I need it because otherwise mypy resorts to allowing unsafe
> code:
>
> def create(cls: typing.Type[Abstract] and cls(...) ) -> Base:
>     return cls()
>
> NameError again. Why? Not because _you_ (people) don't understand it.

Actually, I don't understand exactly what this should do. Does it
assert that cls can be instantiated with some unknown args? Because
you then instantiate it with no args. What does cls(...) require?

> And what if I want "x: type(x)==int"? It has its uses.

Then explicitly assert that. I don't see why you should type-declare
that something is "this and not a subclass" - the whole point of
subclassing is that it still is an instance of the superclass.

Maybe what Python needs is a simple syntax for
"AST-for-this-expression". We have lambda, which means "function which
evaluates this expression"; this would operate similarly.

>>> expr = kappa: x + y
>>> ast.dump(expr)
Module(body=[Expr(value=BinOp(left=Name(id='x', ctx=Load()), op=Add(),
right=Name(id='y', ctx=Load())))])

It'd be exactly the same as ast.parse("x + y"), but might be able to
make use of the existing parsing operation, and would be more easily
syntax highlighted. (Or maybe it'd start at the Expr node instead - so
it'd be equiv to ast.parse("x + y").body[0].) Open to suggestions as
to an actual name.
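
Today you can get the same trees explicitly, which is roughly what the
kappa form would be sugar for:

import ast

module = ast.parse("x + y")                    # the Module shown in the dump above
expr_stmt = module.body[0]                     # the Expr-node variant
expr = ast.parse("x + y", mode="eval").body    # or the bare BinOp itself
print(ast.dump(expr_stmt))
print(ast.dump(expr))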

With that change, your proposals could all be added in a 100% backward
compatible way. Annotations, as a feature, wouldn't change; you'd just
kappafy your contracts:

@contract
def divmod(x: int, y: kappa: int and y != 0) -> kappa: (x//y, x % y):
    ...

And then you could define contract() as either the identity function
(optimized mode, no checks done), or a wrapper function that does
run-time checks.
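
As a very rough sketch of the checking variant, using string annotations
to stand in for the kappa syntax (and skipping plain-type and return
checks entirely):

import functools
import inspect

def contract(func):
    sig = inspect.signature(func)

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        bound = sig.bind(*args, **kwargs)
        bound.apply_defaults()
        for name, check in func.__annotations__.items():
            if isinstance(check, str) and name in bound.arguments:
                # Evaluate the contract expression against the call's arguments.
                assert eval(check, func.__globals__, dict(bound.arguments)), (
                    "contract failed for %r" % name)
        return func(*args, **kwargs)
    return wrapper

@contract
def divmod(x: "isinstance(x, int)", y: "isinstance(y, int) and y != 0"):
    return x // y, x % y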

Maybe that, rather than making annotations magical, would solve the problem?

ChrisA

