[Python-ideas] Delay evaluation of annotations

Steven D'Aprano steve at pearwood.info
Sun Sep 25 22:14:44 EDT 2016


On Sun, Sep 25, 2016 at 01:55:09AM +0000, אלעזר wrote:

> 1. Please consider disallowing the use of side effects of any kind in
> annotations, 

That is *simply not possible* in Python.

Actually, no, that's not quite correct. One way to prohibit side-effects 
would be to make all annotations string literals, and ONLY string 
literals. Or possibly bare names (assuming current semantics for local 
variable name lookup):

def func(arg:'no possible side effects here') -> OrHere:
    ...


But as soon as we allow such things as union types and lists, all bets 
are off:

def func(arg:Sequence[list]): ...

There is no way of prohibiting side effects in type(Sequence).__getitem__ 
once it is called.

Nor would we want to. The ability to shadow or monkey-patch types for 
mocking, testing, debugging etc, including the ability to have them call 
print, or perform logging, is a feature beyond price. We don't need it 
often, but when we do, the ability to replace Sequence with a mock that 
may have side-effects is really useful.
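To make that concrete, here is a minimal sketch. The names NoisyMeta, 
Noisy and calls are made up for illustration; Noisy merely stands in for 
a shadowed or mocked Sequence. Depending on the interpreter version, the 
annotation may be evaluated when the def runs or only when 
__annotations__ is first read, so the sketch reads __annotations__ to 
observe the call either way.

```python
calls = []  # records every subscription, i.e. the side effect


class NoisyMeta(type):
    # Noisy[...] looks up __getitem__ on the metaclass, just as
    # Sequence[list] goes through type(Sequence).__getitem__.
    def __getitem__(cls, item):
        calls.append(item)
        return cls


class Noisy(metaclass=NoisyMeta):
    """Hypothetical stand-in for a mocked Sequence."""


def func(arg: Noisy[list]): ...


hints = func.__annotations__  # forces evaluation if it was deferred
print(calls)
print(hints)
```

Nothing in the compiler can know what NoisyMeta.__getitem__ will do, 
which is the whole point.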



> in that it is not promised when it will happen, if at all. So
> that a change 3 years from now will be somewhat less likely to break
> things. Please consider doing this for version 3.6; it is feature-frozen,
> but this is not (yet) a feature, 

It has been a feature since Python 3.0 that annotations are evaluated at 
runtime. And that means the possibility of side-effects. So, yes, it is 
already a feature.

Even if you get the behaviour that you want, the absolute earliest it 
could happen would be after a deprecation period of at least one point 
release. That means:

* 3.7 introduces a DeprecationWarning whenever you use annotations 
  which aren't simple names or strings;

* and possibly a __future__ import to give the new behaviour;

* and 3.8 would be the earliest it could be mandatory.

Forget about 3.6 -- that's already frozen apart from bug fixes, 
and this is not a bug.


> and I got the feeling it is hardly controversial.

It is extremely controversial. The fact that you can say that it isn't 
suggests that you're not really paying attention to what we're saying. 
Even if what you ask for is easy (it isn't), or even possible, it still 
goes completely and utterly against the normal semantics of Python and 
the philosophy of the language.

No, under normal circumstances nobody is going to write:

def func(arg: mylist.append(value) or int):
    ...

in production code. That's simply bad style. But we don't ban things 
just because they are bad style. Circumstances are not always normal, 
sometimes it is useful to use dirty hacks (but hopefully not in 
production code), and Python is not a B&D language where everything is 
prohibited unless explicitly allowed.
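
For what it's worth, that example is perfectly legal today. A quick 
sketch, assuming mylist and value are ordinary module-level names (the 
values are arbitrary, chosen only for illustration). As before, 
__annotations__ is read explicitly in case the interpreter defers 
evaluation.

```python
mylist = []   # assumed module-level list, for illustration only
value = 42    # arbitrary value


def func(arg: mylist.append(value) or int):
    ...


hints = func.__annotations__  # forces evaluation if it was deferred

# list.append returns None, so "None or int" evaluates to int,
# and the append has happened as a side effect.
print(hints)
print(mylist)
```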


> I really have no interest in wasting the time of anybody here.

And yet, despite receiving virtually no interest from any other person, 
you continue to loudly and frequently argue for this proposal.

[...]
> 2. A refined proposal for future versions of the language: the ASTs of the
> annotation-expressions will be bound to __raw_annotations__.
>    * This is actually more in line to what PEP-3107 was about ("no assigned
> semantics"; except for a single sentence, it is only about expressions. Not
> objects).

All expressions evaluate to a value. And all values in Python are 
objects. I don't understand what distinction you think you are making 
here. Are you suggesting that Python should gain some sort of values 
which aren't objects?


>    * This is helpful even if the expression is evaluated at definition
> time, and can help in smoothing the transformation.
> 
> 3. The main benefit from my proposal is that contracts (examples,
> explanations, assertions, and types) are naturally expressible as (almost)
> arbitrary Python expressions, but not if they are evaluated or evaluatable,
> at definition time, by the interpreter. Why: because it is really written
> in a different language - *always*. 

Wrong. Not always.

The proof is that Python exists. Contracts, types, assertions etc in 
Python *are* written in Python. That's the end of the story.

You cannot argue that "contracts are written in a different language" 
because that is untrue. Contracts are written in Python, and we wouldn't 
have it any other way.

> This is the real reason behind the
> existence, and the current solutions, of the forward reference problem. In
> general it is much more flexible than current situation.

The forward reference problem still exists in languages where type 
declarations are a separate language, e.g. Pascal, C++, Java, etc.

http://bugs.java.com/bugdatabase/view_bug.do?bug_id=4754974
http://stackoverflow.com/questions/951234/forward-declaration-of-nested-types-classes-in-c

etc. There are many ways around it. One way is to make the language so 
simple that forward declarations aren't relevant. Another is to make 
multiple passes over the source code. Another is to introduce an 
explicit "forward" declaration, as in some dialects of Pascal. Python 
uses strings.
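
A minimal sketch of the string approach, using a made-up Node class. The 
name Node is not bound yet while the class body is running, so the 
annotation is spelled as a string, and typing.get_type_hints resolves it 
later on request. Passing localns explicitly is my own precaution so the 
snippet also works if pasted inside a function; at module level it is 
not needed.

```python
import typing


class Node:
    """Hypothetical linked-list node, for illustration only."""

    def __init__(self):
        self.next = None

    # 'Node' is a forward reference: the class does not exist yet
    # when this def is executed.
    def link(self, other: 'Node') -> 'Node':
        self.next = other
        return other


raw = Node.link.__annotations__
resolved = typing.get_type_hints(Node.link, localns={"Node": Node})

print(raw)       # the annotations are still plain strings
print(resolved)  # the strings have been evaluated to the class
```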


> I believe the last point has a very good reason, as explained later: it is
> an interpretation of a different language, 

But it *isn't* such a thing, nor should it be.

> foreign to the interpreter,
> although sometimes close enough to be useful.

Sometimes close enough to be useful. Does that mean it is usually 
useless? *wink*


> It is of course well formed,
> so the considerations are not really security-related.

You've talked about eval'ing the contents of __raw_annotations__. That 
means if somebody can fool you into storing arbitrary values into 
__raw_annotations__, then get you to call annotations() or use inspect, 
they can execute arbitrary code.

How is this not a security concern?

It might be hard to exploit, since it requires the victim to do 
something like:

myfunc.__raw_annotations__['arg'] = something_untrusted

but if exploited, the consequences are major: full eval of arbitrary 
code.
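
Here is a rough sketch of that scenario. To be clear, neither 
__raw_annotations__ nor annotations() exists in Python; both are 
hypothetical and only mimic my reading of the proposal (ASTs stored on 
the function, evaluated on demand). The "payload" is deliberately 
harmless and just appends to a list.

```python
import ast

executed = []  # lets us observe that the payload really ran


def myfunc(arg):
    ...


# Hypothetical attribute from the proposal: unevaluated annotation ASTs.
myfunc.__raw_annotations__ = {"arg": ast.parse("int", mode="eval")}


def annotations(func):
    """Hypothetical helper: evaluate each stored AST on demand."""
    result = {}
    for name, tree in func.__raw_annotations__.items():
        code = compile(tree, "<annotation>", "eval")
        result[name] = eval(code, {"executed": executed})
    return result


# The victim is fooled into storing something untrusted...
something_untrusted = ast.parse(
    "executed.append('payload ran') or int", mode="eval"
)
myfunc.__raw_annotations__["arg"] = something_untrusted

# ...and later, innocently, asks for the annotations.
hints = annotations(myfunc)
print(hints)
print(executed)
```

The result looks like an innocent annotation, but the payload has 
already run by the time the caller sees it.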

In comparison, the only similar threat with annotations today is if the 
victim is fooled into building a string containing a def 
with annotations, then passing it to exec:

annot = something_untrusted
code = """def func(arg: %s):
   ...
""" % annot
exec(code)

but if you're using exec on an untrusted string you have already lost. 
So annotations as they exist now aren't adding any new vulnerabilities. 


Still, the important thing here is not the (hard to exploit) potential 
vulnerability, but the fact that your proposal would lead to a massive 
increase in the complexity of the language (a whole new 
compiler/interpreter for the second, types-only, mini-language) and an 
equally major *decrease* in useful functionality.

Have I mentioned that I'm against this? If not, I'm against it.



-- 
Steve

