On Jan 22, 2016, at 10:37, Brett Cannon <brett@python.org> wrote:

> On Thu, 21 Jan 2016 at 10:45 Guido van Rossum <guido@python.org> wrote:
>
>> Yes, this is a useful thing to discuss.
>>
>> Maybe we can standardize on the types defined by the 'six' package, which is commonly used for 2-3 straddling code:
>>
>> six.text_type (unicode in PY2, str in PY3)
>> six.binary_type (str in PY2, bytes in PY3)
>>
>> Actually for the latter we might as well use bytes.
>
> I agree that `bytes` should cover str/bytes in Python 2 and `bytes` in Python 3.
>
> As for the textual type, I say either `text` or `unicode` since they are both unambiguous between Python 2 and 3 and get the point across.

The only problem is that, while bytes is a builtin type in both 2.7 and 3.x, with similar behavior (especially in 3.5, where simple %-formatting code works the same as in 2.7), unicode exists in 2.x but not 3.x, so that would require people writing something like "try: unicode except: unicode=str" at the top of every file (or monkeypatching builtins somewhere) for the annotations to actually be valid 3.x code. And, if you're going to do that, using something that's already widespread and as close to a de facto standard as possible, like the six type suggested by Guido, seems less disruptive than inventing a new standard (even if "text" or "unicode" is a little nicer than "six.text_type").
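To make that concrete, here is a minimal sketch of the per-file shim, assuming annotations are written as type comments and the textual type is spelled `unicode`. The function name `greet` is made up for illustration.

```python
# Per-file fallback: on Python 3 the builtin name `unicode` does not
# exist, so looking it up raises NameError and we bind it to `str`.
try:
    unicode
except NameError:
    unicode = str


def greet(name):
    # type: (unicode) -> unicode
    # Hypothetical function; the type comment only resolves on 3.x
    # because of the shim above.
    return u"Hello, " + name
```

Every module that mentions `unicode` in an annotation would need those four lines (or an import that provides the name), which is the cost being weighed against `six.text_type`.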

(Or, of course, Guido could just get in his time machine and, along with restoring the u string literal prefix in 3.3, also restore the builtin name unicode as a synonym for str, and then this whole mail thread would fade out like Marty McFly.)

Also, don't forget "basestring", which some 2.x code uses. A lot of such code just drops bytes support when modernizing, but if not, it has to change to something that means basestring or str|unicode in 2.x and bytes|str in 3.x. Again, six has a solution for that, string_types, and mypy could standardize on that solution too.
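A rough sketch of the basestring case follows. The aliases are stdlib-only stand-ins written for illustration, not six itself, and `text_or_bytes` and `is_stringy` are hypothetical names. One caveat: six's own string_types is (basestring,) in 2.x but only (str,) in 3.x, so code that keeps bytes support in 3.x would need a wider tuple like the second one here.

```python
import sys

PY2 = sys.version_info[0] == 2

if PY2:
    # basestring is the common base of str and unicode in 2.x.
    string_types = (basestring,)  # noqa: F821 (2.x-only builtin)
    text_or_bytes = (basestring,)  # noqa: F821
else:
    # Mirrors six: text only on 3.x.
    string_types = (str,)
    # Hypothetical wider alias for code that still accepts bytes on 3.x.
    text_or_bytes = (str, bytes)


def is_stringy(value):
    # type: (object) -> bool
    # True for text or bytes on either major version.
    return isinstance(value, text_or_bytes)
```

Whichever tuple is meant, the checker would need a single agreed name for it so that the same annotation reads correctly under both versions.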

> And does `str` represent the type for the specific version of Python mypy is running under, or is it pegged to a specific representation across Python 2 and 3? If it's the former then fine,

In six-based code, it means native string, and there are tools designed to help you go over all your str uses and decide which ones should be changed to something else (usually text_type or binary_type), but no special name to use when you decide "I really do want native str here". So, I think it makes sense for mypy to assume the same, rather than to encourage people to shadow or rebind str to make mypy happy in 2.x.

Speaking of native strings: six code often doesn't use native strings for __str__, instead using explicit text, and the @python_2_unicode_compatible class decorator. Will mypy need special support for that decorator to handle those types? If so, it's probably worth adding; otherwise, it would be encouraging people to stick with native strings instead of switching to text.
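For anyone who hasn't seen the pattern, here is a small sketch of what that decorator does. The decorator below is a stand-in modeled on six's python_2_unicode_compatible so the example runs without six installed, and the `Person` class is hypothetical.

```python
import sys

PY2 = sys.version_info[0] == 2


def python_2_unicode_compatible(cls):
    # Stand-in modeled on six's decorator of the same name.
    # On 2.x, the text-returning __str__ becomes __unicode__, and a new
    # __str__ returns UTF-8 encoded bytes. On 3.x the class is unchanged.
    if PY2:
        cls.__unicode__ = cls.__str__
        cls.__str__ = lambda self: self.__unicode__().encode("utf-8")
    return cls


@python_2_unicode_compatible
class Person(object):
    def __init__(self, name):
        self.name = name

    def __str__(self):
        # As written, this always returns text, never a native 2.x str.
        return u"Person: " + self.name
```

So the method as written returns text on both versions, but after decoration `__str__` returns bytes in 2.x. A checker that doesn't know about the decorator would see a signature that is wrong for one side or the other.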