[Python-ideas] Delay evaluation of annotations
Steven D'Aprano
steve at pearwood.info
Sun Sep 25 22:14:44 EDT 2016
On Sun, Sep 25, 2016 at 01:55:09AM +0000, אלעזר wrote:
> 1. Please consider disallowing the use of side effects of any kind in
> annotations,
That is *simply not possible* in Python.
Actually, no, that's not quite correct. One way to prohibit side-effects
would be to make all annotations string literals, and ONLY string
literals. Or possibly bare names (assuming current semantics for local
variable name lookup):
    def func(arg:'no possible side effects here') -> OrHere:
        ...
But as soon as we allow such things as union types and lists, then all
bets are off:
    def func(arg:Sequence[list]): ...
There is no way of prohibiting side effects in type(Sequence).__getitem__
once it is called.
Nor would we want to. The ability to shadow or monkey-patch types for
mocking, testing, debugging etc, including the ability to have them call
print, or perform logging, is a feature beyond price. We don't need it
often, but when we do, the ability to replace Sequence with a mock that
may have side-effects is really useful.
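As a minimal sketch of that kind of replacement, here is a stand-in whose subscription has a side effect. The names RecordingMeta, MockSequence and seen are hypothetical, invented for this illustration. MockSequence is not typing.Sequence; it only mimics the type(Sequence).__getitem__ path.

```python
# Hypothetical mock: subscripting the class records what was requested.
seen = []

class RecordingMeta(type):
    def __getitem__(cls, item):
        # The side effect: runs whenever MockSequence[...] is evaluated.
        seen.append(item)
        return cls

class MockSequence(metaclass=RecordingMeta):
    pass

def func(arg: MockSequence[list]): ...

# Reading __annotations__ guarantees the annotation has been evaluated,
# including on interpreters that defer evaluation until first access.
annotations = func.__annotations__
```

After this runs, seen holds the subscript that was passed in, which is exactly the sort of hook a test or debugging session might want.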
> in that it is not promised when it will happen, if at all. So
> that a change 3 years from now will be somewhat less likely to break
> things. Please consider doing this for version 3.6; it is feature-frozen,
> but this is not (yet) a feature,
It has been a feature since Python 3.0 that annotations are evaluated at
runtime. And that means the possibility of side-effects. So, yes, it is
already a feature.
Even if you get the behaviour that you want, the absolute earliest it
could happen would be after a deprecation period of at least one point
release. That means:
* 3.7 introduces a DeprecationWarning whenever you use annotations
which aren't simple names or strings;
* and possibly a __future__ import to give the new behaviour;
* and 3.8 would be the earliest it could be mandatory.
Forget about 3.6 -- that's already frozen apart from bug fixes,
and this is not a bug.
> and I got the feeling it is hardly controversial.
It is extremely controversial. The fact that you can say that it isn't
suggests that you're not really paying attention to what we're saying.
Even if what you ask for is easy (it isn't), or even possible, it still
goes completely and utterly against the normal semantics of Python and
the philosophy of the language.
No, under normal circumstances nobody is going to write:
    def func(arg: mylist.append(value) or int):
        ...
in production code. That's simply bad style. But we don't ban things
just because they are bad style. Circumstances are not always normal,
sometimes it is useful to use dirty hacks (but hopefully not in
production code), and Python is not a B&D language where everything is
prohibited unless explicitly allowed.
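For anyone who wants to try the dirty hack above, here is a runnable version. The initial values of mylist and value are assumptions made for the demonstration. Since list.append returns None, the expression "mylist.append(value) or int" evaluates to int after appending.

```python
mylist = []
value = 42

def func(arg: mylist.append(value) or int):
    ...

# Reading __annotations__ guarantees the annotation has been evaluated,
# including on interpreters that defer evaluation until first access.
annotations = func.__annotations__
```

The annotation ends up as plain int, and the list has been mutated as a side effect of evaluating it.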
> I really have no interest in wasting the time of anybody here.
And yet, despite receiving virtually no interest from any other person,
you continue to loudly and frequently argue for this proposal.
[...]
> 2. A refined proposal for future versions of the language: the ASTs of the
> annotation-expressions will be bound to __raw_annotations__.
> * This is actually more in line to what PEP-3107 was about ("no assigned
> semantics"; except for a single sentence, it is only about expressions. Not
> objects).
All expressions evaluate to a value. And all values in Python are
objects. I don't understand what distinction you think you are making
here. Are you suggesting that Python should gain some sort of values
which aren't objects?
> * This is helpful even if the expression is evaluated at definition
> time, and can help in smoothing the transformation.
>
> 3. The main benefit from my proposal is that contracts (examples,
> explanations, assertions, and types) are naturally expressible as (almost)
> arbitrary Python expressions, but not if they are evaluated or evaluatable,
> at definition time, by the interpreter. Why: because it is really written
> in a different language - *always*.
Wrong. Not always.
The proof is that Python exists. Contracts, types, assertions etc in
Python *are* written in Python. That's the end of the story.
You cannot argue that "contracts are written in a different language"
because that is untrue. Contracts are written in Python, and we wouldn't
have it any other way.
> This is the real reason behind the
> existence, and the current solutions, of the forward reference problem. In
> general it is much more flexible than current situation.
The forward reference problem still exists in languages where type
declarations are a separate language, e.g. Pascal, C++, Java, etc.
http://bugs.java.com/bugdatabase/view_bug.do?bug_id=4754974
http://stackoverflow.com/questions/951234/forward-declaration-of-nested-types-classes-in-c
etc. There are many ways around it. One way is to make the language so
simple that forward declarations aren't relevant. Another is to make
multiple passes over the source code. Another is to introduce an
explicit "forward" declaration, as in some dialects of Pascal. Python
uses strings.
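A short sketch of the string approach, using a hypothetical Node class made up for this example. The string is stored as-is in __annotations__, and typing.get_type_hints resolves it to the real class later on request.

```python
from typing import get_type_hints

class Node:
    # On interpreters that evaluate annotations at definition time, the
    # name Node is not yet bound here, so it is written as a string.
    def link(self, other: 'Node') -> 'Node':
        self.next = other
        return other

# The raw annotations are still plain strings...
raw = Node.link.__annotations__

# ...and are only resolved to the class when explicitly asked for.
resolved = get_type_hints(Node.link)
```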
> I believe the last point has a very good reason, as explained later: it is
> an interpretation of a different language,
But it *isn't* such a thing, nor should it be.
> foreign to the interpreter,
> although sometimes close enough to be useful.
Sometimes close enough to be useful. Does that mean it is usually
useless? *wink*
> It is of course well formed,
> so the considerations are not really security-related.
You've talked about eval'ing the contents of __raw_annotations__. That
means if somebody can fool you into storing arbitrary values into
__raw_annotations__, then get you to call annotations() or use inspect,
they can execute arbitrary code.
How is this not a security concern?
It might be hard to exploit, since it requires the victim to do
something like:
    myfunc.__raw_annotations__['arg'] = something_untrusted
but if exploited, the consequences are major: full eval of arbitrary
code.
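A rough sketch of that scenario follows. Everything in it is hypothetical: Python has no __raw_annotations__ attribute, the helper evaluate_raw_annotations is invented here, and storing source strings (rather than ASTs) is a simplifying assumption. The payload is harmless and only appends to a list.

```python
# Hypothetical sketch: neither __raw_annotations__ nor this helper
# exists in Python. Source strings stand in for the proposed raw form.
executed = []

def evaluate_raw_annotations(func, namespace):
    """Evaluate each stored annotation source string with eval()."""
    return {name: eval(source, namespace)
            for name, source in func.__raw_annotations__.items()}

def myfunc(arg): ...

myfunc.__raw_annotations__ = {'arg': 'int'}

# The victim is fooled into storing untrusted text as an annotation.
something_untrusted = "executed.append('arbitrary code ran') or int"
myfunc.__raw_annotations__['arg'] = something_untrusted

# Evaluating the stored annotations now runs the attacker's expression.
result = evaluate_raw_annotations(myfunc, {'executed': executed})
```

The result looks like an innocent int annotation, but the attacker's expression has already run by the time it is returned.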
In comparison, the only similar threat with annotations today is if the
victim is fooled into building a string containing a def
with annotations, then passing it to exec:
    annot = something_untrusted
    code = """def func(arg: %s):
        ...
    """ % annot
    exec(code)
but if you're using exec on an untrusted string you have already lost.
So annotations as they exist now aren't adding any new vulnerabilities.
Still, the important thing here is not the (hard to exploit) potential
vulnerability, but the fact that your proposal would lead to a massive
increase in the complexity of the language (a whole new compiler/
interpreter for the second, types-only, mini-language) and an equally
major *decrease* in useful functionality.
Have I mentioned that I'm against this? If not, I'm against it.
--
Steve