[Python-ideas] Proposal: Use mypy syntax for function annotations

Guido van Rossum guido at python.org
Fri Aug 15 06:34:31 CEST 2014


On Thu, Aug 14, 2014 at 9:12 PM, Stefan Behnel <stefan_ml at behnel.de> wrote:

> Guido van Rossum wrote on 14.08.2014 at 07:24:
> > On Wed, Aug 13, 2014 at 9:06 PM, Jukka Lehtosalo wrote:
> >> You could use AnyStr to make the example work with bytes as well:
> >>
> >>   def word_count(input: Iterable[AnyStr]) -> Dict[AnyStr, int]:
> >>       result = {}  # type: Dict[AnyStr, int]
> >>
> >>       for line in input:
> >>           for word in line.split():
> >>               result[word] = result.get(word, 0) + 1
> >>       return result
> >>
> >> Again, if this is just a simple utility function that you use once or
> >> twice, I see no reason to spend a lot of effort in coming up with the
> >> most general signature. Types are an abstraction and they can't express
> >> everything precisely -- there will always be a lot of cases where you
> >> can't express the most general type. However, I think that relatively
> >> simple types work well enough most of the time, and give the most bang
> >> for the buck.
> >
> > I heartily agree. But just for the type theorists amongst us, if I really
> > wanted to write the most general type, how would I express that the
> > AnyStr in the return type matches the one in the argument? (I think
> > pytypedecl would use something like T <= AnyStr.)
>
> That's how Cython's "fused types" (generics) work, at least. They go by
> name: same name of the type, same type. Otherwise, use alias names, which
> make the types independent from each other.
>
> http://docs.cython.org/src/userguide/fusedtypes.html
>
> While it's a matter of definition what way to go here (same type or not),
> practice has shown that it's clearly the right decision to make identical
> types the default.
>

I don't understand those docs at all, but I do think I understand the rule
"same name, same type" and I think I like it. Let me be clear -- in this
example:

def word_count(input: Iterable[AnyStr]) -> Mapping[AnyStr, int]:
    ...

the implication would be that if the input is Iterable[bytes] the output is
Mapping[bytes, int] while if the input is Iterable[str] the output is
Mapping[str, int]. Have I got that right? I hope so, because I think it is
a nice simplifying rule that covers a lot of cases in practice. (Note:
AnyStr is a predefined type in mypy that means "str or bytes".)
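
A minimal sketch of the "same name, same type" reading, assuming the typing
module as it later shipped in the standard library (where AnyStr is a type
variable constrained to str and bytes) rather than anything available at the
time of this message. The parameter name `lines` and the variable names are
illustrative only:

```python
from typing import AnyStr, Dict, Iterable, Mapping


def word_count(lines: Iterable[AnyStr]) -> Mapping[AnyStr, int]:
    # Both occurrences of AnyStr refer to the same type within one call,
    # so str input yields str keys and bytes input yields bytes keys.
    counts: Dict[AnyStr, int] = {}
    for line in lines:
        for word in line.split():
            counts[word] = counts.get(word, 0) + 1
    return counts


# Iterable[str] in, Mapping[str, int] out.
text_counts = word_count(["a b a", "b c"])

# Iterable[bytes] in, Mapping[bytes, int] out.
byte_counts = word_count([b"a b a", b"b c"])
```

A checker following this rule would be expected to reject a call that mixes
str and bytes lines, since no single choice of AnyStr fits both.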

BTW there are a lot of messy things to consider around bytes, and IIUC mypy
currently doesn't really cover them. Often when you write code that accepts
a bytes instance, in practice it will accept anything that supports the
buffer protocol (e.g. bytearray and memoryview). The exception is when you
use it as a dict key: bytearray is mutable and therefore unhashable, so it
won't work. And if you say that you are returning bytes, you probably
shouldn't be returning a memoryview or bytearray. I don't expect that any
type system we can come up with will be quite precise enough to cover all
the cases, so we probably shouldn't lose too much sleep over this.
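
To make the bytes messiness concrete, here is a small sketch using only
builtins. The helper names `first_byte` and `usable_as_key` are made up for
illustration; they are not part of mypy or any proposal:

```python
def first_byte(data):
    # memoryview() accepts any object that supports the buffer protocol,
    # so bytes, bytearray and memoryview all work here.
    return memoryview(data)[0]


def usable_as_key(obj):
    # Dict keys must be hashable; building a one-item dict tests that.
    try:
        {obj: 1}
        return True
    except TypeError:
        return False


samples = [b"abc", bytearray(b"abc"), memoryview(b"abc")]

# All three are accepted where "bytes" is nominally expected.
first_bytes = [first_byte(s) for s in samples]

# But only the immutable bytes object can serve as a dict key.
bytes_ok = usable_as_key(b"abc")
bytearray_ok = usable_as_key(bytearray(b"abc"))
```

A signature that says just "bytes" cannot distinguish these two functions,
even though one accepts bytearray and the other effectively does not.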

-- 
--Guido van Rossum (python.org/~guido)