[Cython] Py3 annotation syntax for static typing

Stefan Behnel stefan_ml at behnel.de
Sat Jul 13 07:32:38 CEST 2013


Jukka Lehtosalo, 12.07.2013 14:03:
> On Thu, Jul 11, 2013 at 6:52 PM, Stefan Behnel wrote:
>> http://mypy-lang.blogspot.de/2013/07/mypy-switches-to-python-compatible.html
> 
>> It says that you are starting to adopt the Py3 function annotation syntax
>> for mypy. I think we should try to keep that in sync for both mypy and
>> Cython. It would be bad to have two ways to do it.
> 
> Agreed.
> 
>> When we discussed this for Cython (see the trac ticket [1] and Wiki entry
>> [2] below), also on the mailing list (back in 2008 I guess), the main
>> problem was that it's not immediately clear what should be considered type
>> annotations by the compiler and what's arbitrary "other stuff" that users
>> put into declarations. There's also no obvious way to allow for multiple
>> independent annotations. Should they be in a tuple? In a dict? What would
>> the dict key be? How can we prevent collisions and conflicts when multiple
>> tools base their features on annotations?
> 
> I think that this general problem can be approached from several
> directions:
> 
> (1) Allow different annotation styles to be used in different modules,
>     but expect a uniform annotation style within a module. Each tool
>     should be able to recognize modules that use its annotation
>     syntax and ignore annotations in other modules.

That would still conflict with the case of more than one kind of
annotation, e.g. a docstring and (one or potentially more) static types.


> (2) Support a fallback annotation syntax in each tool that doesn't use
>     Python 3 annotations, such as function decorators. This would let
>     users continue to use arbitrary annotations, at the cost of an
>     uglier type annotation syntax. Also, the syntax must support
>     stacking multiple annotations. Function decorators are nice in
>     this respect, but annotations in comments may be more troublesome.

Cython has that in its Pure Python Mode syntax. I think it's ok to use a
tool-specific decorator approach here, although it would be nice if we could
keep at least the names similar for both Cython and mypy. So, please take a
look through the Pure Mode page if you want to go this route, and let's
discuss anything that doesn't fit one of the tools for some reason.


> (3) Support mixing multiple independent annotation styles within a
>     single function, using the Python 3 annotation syntax (as you
>     mentioned, if I understood your point correctly).

Yes, I think there should be a way to do that. I haven't really seen any
tools out in the wild that use the annotation syntax for any of their
features, so this still seems to be uncovered ground. I think using a dict
would be a good idea. Or a multi-level approach, where the annotation value
can be

a) a single value, which one tool would recognise and other tools would
reject (or ignore? sounds fragile to ignore unknown values...). Maybe just
ignore plain (doc-)strings and reject all other values that a tool cannot
recognise.

b) a 2-tuple containing one (doc-)string and one tool specific annotation
value, as in a). Maybe require the docstring to appear after the annotation
value, as the docstring is likely to be longer, and the annotation value
might also be in a string. So, a 2-tuple (type, docstring).

c) a dict that maps tool specific keys to annotation values. Each tool
would only care about its own keys and ignore all others.

I think this also provides for an easy upgrade path: start with a single
value, optionally add a docstring to it in a tuple. If you need more tools,
switch to an explicit dict. People will usually know which tool the
original annotations belong to (and if not, they can just rip them out,
right?), so the transition towards a more verbose syntax should be quite
straightforward, and the multi-level approach helps in keeping things
simple in the simple case.
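
To make the three levels a bit more concrete, here is a rough sketch of how
a tool might take one annotation apart. Nothing like this exists in Cython
or mypy; the helper name split_annotation(), the "doc" key in the dict form
and the example function are all made up for illustration.

```python
def split_annotation(annotation, tool_key):
    """Return (value, docstring) for one annotation.

    Hypothetical helper. 'tool_key' is the dict key this tool owns,
    e.g. "ctype". The "doc" key in the dict form is an assumption.
    """
    if isinstance(annotation, dict):
        # (c) dict: read our own key, ignore all others
        return annotation.get(tool_key), annotation.get("doc")
    if isinstance(annotation, tuple):
        # (b) 2-tuple: (type, docstring)
        if len(annotation) != 2 or not isinstance(annotation[1], str):
            raise TypeError("expected a (type, docstring) 2-tuple")
        return annotation
    if isinstance(annotation, str):
        # (a) plain string: treat it as a docstring and ignore it
        return None, annotation
    # (a) any other single value: the tool must still accept or reject it
    return annotation, None


def scale(x: int, factor: (float, "scale factor"),
          flags: {"ctype": int, "my:type": list}, note: "free text"):
    return x * factor


for name, annotation in scale.__annotations__.items():
    print(name, split_annotation(annotation, "ctype"))
```

Whether a single unknown value should be rejected or ignored is left to the
caller here, since that is exactly the open question in a).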

The dict keys should then be either an identifier that multiple tools
understand (e.g. "type": list), or use a tool specific prefix, e.g. "cy:",
"my:", "jy:", etc. Note that "type": int might be ambiguous, as Cython
would consider it a C int. But I guess "ctype": int would solve that, and
still be usable by multiple tools.

An alternative would be to allow for (or require?) prefixed string values
as simple/2-tuple annotation values, e.g. "ctype:int" or "cy:type:int".
That would also make it possible to distinguish simple docstrings from
simple annotation values in strings.
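
A minimal sketch of how such prefixed strings could be told apart from
plain docstrings, assuming the tool prefixes and shared keys mentioned
above. The function name parse_prefixed() and the two sets are purely
illustrative, not part of any tool.

```python
TOOL_PREFIXES = {"cy", "my", "jy"}   # assumed tool-specific prefixes
SHARED_KEYS = {"type", "ctype"}      # assumed keys shared between tools


def parse_prefixed(text):
    """Split "cy:type:int" or "ctype:int" into (tool, key, value).

    Returns None for anything else, which a tool would then treat as a
    plain docstring.
    """
    head, sep, rest = text.partition(":")
    if not sep:
        return None
    if head in TOOL_PREFIXES:
        # tool-specific form: "<tool>:<key>:<value>"
        key, sep, value = rest.partition(":")
        if sep and key and value:
            return head, key, value
        return None
    if head in SHARED_KEYS and rest:
        # shared form: "<key>:<value>", not owned by a single tool
        return None, head, rest
    return None


print(parse_prefixed("cy:type:int"))
print(parse_prefixed("ctype:int"))
print(parse_prefixed("note: number of items"))
```

Note that a docstring which happens to start with a known prefix would be
misread, so the set of prefixes would have to stay small and well known.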

Or, well, rely on module namespaces and imports to distinguish between
annotation values. I see that mypy has a "typing" module which you can import.

http://mypy-lang.org/tutorial.html#typing

In Cython, there's the special "cython" module namespace. Personally, I
find "typing" a bit broad as a top-level module name. There are also the
ABCs in Py3, spread over the modules "abc", "collections.abc" (in Py3.3,
previously in "collections") and "types" in the standard library. Some of
your annotations seem to overlap with those. If the goal is to avoid
duplication, it would be good to integrate what's there anyway.
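
As a sketch of the namespace idea: if each tool only exports its own type
objects from its own module, it can pick out its annotations with a simple
isinstance() check. The classes CyType and MyType below are hypothetical
stand-ins, not the real objects from the "cython" or "typing" modules.

```python
class CyType:
    """Hypothetical stand-in for a type object from one tool's module."""
    def __init__(self, name):
        self.name = name


class MyType:
    """Hypothetical stand-in for a type object from another tool's module."""
    def __init__(self, name):
        self.name = name


def annotations_for(func, marker):
    """Keep only the annotations that are instances of 'marker'."""
    return {name: value.name
            for name, value in func.__annotations__.items()
            if isinstance(value, marker)}


def clamp(value: CyType("double"), limit: MyType("float"),
          label: "used in error messages"):
    return min(value, limit)


print(annotations_for(clamp, CyType))
print(annotations_for(clamp, MyType))
```

This only separates the tools from each other. It does not by itself allow
two tools to annotate the same argument.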


> (4) Agree on a single common type annotation style that can be used by
>     multiple projects.
> 
> I think (1) is essential, (2) is important, and I'm undecided about
> (3).  I'm not sure whether (4) is practically feasible, as different
> projects have different goals and features they want to support.

See above. I think there can and should be a mix, as long as different
projects keep an eye on each other. Keep common what we can, diverge where
it makes sense.


>> It's mainly a namespace problem. The annotation namespace is essentially
>> flat, and I think that only practical usage can eventually establish
>> suitable conventions.
> 
>> I'd like to discuss this, maybe we can come up with a suitable Best
>> Practice. What's your opinion on the issue so far?
> 
> I think we should first focus on (1) above, as it should be easy to
> solve.  As for (3), we should come up with some real-world use cases
> or examples.  For example, here are a few potential scenarios:
> 
> (A) A programmer uses Python 3 annotations for internal purposes,
>     using an ad-hoc syntax. The programmer would like to speed up the
>     code by using Cython, but she wants the code to remain
>     Python-compatible, and she wants to continue using Python 3
>     annotations for internal purposes.
> 
> (B) A programmer uses mypy for type checking his program. He'd like to
>     speed up a critical function using Cython, but he also wants to
>     retain mypy type checking.

I think both cases are covered by the approach above. Although users might
end up having (or wanting?) to use both or all three syntaxes in the same
module, in order to cover all cases. This might have an impact on code
readability.

My guess is that users who start using the dict based annotation style in a
module at some point are best advised to convert the entire module at that
point, in order to keep it simpler to read. An automated conversion tool
based on 2to3 could then do the trick.


>> There's also our current Pure Python syntax mode for everything that
>> cannot be expressed with function annotations:
> 
>> http://docs.cython.org/src/tutorial/pure.html
> 
> I'm already somewhat familiar with the Pure Python syntax, though I
> haven't tried it in practice.
> 
>> I would guess that mypy will eventually need something similar.
> 
> Mypy already (only) has a pure Python syntax, as all mypy constructs
> are now syntactically valid Python, and valid mypy programs are
> basically also runnable Python programs.  In fact, currently a Python
> 3.2 or later VM is the only supported way of running mypy programs.

Ok, so you basically have a way to statically type variables during
assignments.

http://mypy-lang.org/tutorial.html#collectiontypes

Cython has a slightly different approach here in that it requires either an
external decorator (@cython.locals()) or an explicit function call during
assignments (cython.declare()). I must say, I find none of the three
approaches really pretty, but I guess there simply is no perfect way to do
these things, once you accept that you have to put static type annotations
*somewhere* in your code.
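
For comparison, the two Cython styles look roughly like this. To keep the
sketch runnable without Cython installed, locals_() and declare() below are
plain-Python stand-ins that I made up to mimic the shape of
@cython.locals() and cython.declare(); they are not Cython's
implementation and do no static typing at all.

```python
def locals_(**types):
    """Stand-in decorator: record the declared types, change nothing."""
    def decorator(func):
        func.declared_locals = types
        return func
    return decorator


def declare(type_, value=None):
    """Stand-in for a declaring call: at run time, just return the value."""
    return value


# Style 1: types live outside the function body, in a decorator.
@locals_(total=float, count=int)
def mean_decorated(values):
    total = 0.0
    count = 0
    for v in values:
        total += v
        count += 1
    return total / count


# Style 2: types are stated at the point of assignment.
def mean_declared(values):
    total = declare(float, 0.0)
    count = declare(int, 0)
    for v in values:
        total += v
        count += 1
    return total / count


print(mean_decorated([1, 2, 3]), mean_declared([1, 2, 3]))
```

Either way the function still runs as ordinary Python, which is the point
of the Pure Python mode, but the type information ends up either far away
from the variable or wrapped around its first value.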

Stefan


