[Python-ideas] Type hints for text/binary data in Python 2+3 code
Andrey Vlasovskikh
andrey.vlasovskikh at gmail.com
Tue Mar 22 18:18:30 EDT 2016
> On 2016-03-19, at 21:51, Guido van Rossum <guido at python.org> wrote:
>
> I like the way this is going. I think it needs to be a separate PEP;
> PEP 484 is already too long and this topic deserves being written up
> carefully (like you have done here).
I would like to experiment with various text/binary types for Python 2 and 3 for some time before coming up with a PEP about it. I would also like everybody interested in 2/3-compatible type hints to join the discussion. My perspective (mostly PyCharm-specific) might be a bit narrow here.
> * Do we really need _AsciiUnicode? I see the point of _AsciiStr,
> because Python 2 accepts 'x' + u'' but fails '\xff' + u'', so 'x'
> needs to be of type _AsciiStr while '\xff' should not (it should be
> just str). However there's no difference in how u'x' is treated from
> how u'\u1234' or u'\xff' are treated -- none of them can be
> concatenated to '\xff' and all of them can be concatenated to 'x'.
I was concerned about UnicodeEncodeError being raised in Python 2 during implicit conversions from unicode to bytes, for example:
getattr(obj, u'Non-ASCII-name')
There are several places in the Python 2 API where these ASCII-based unicode->bytes conversions take place, so the _AsciiUnicode type comes to mind.
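To make the failure mode concrete, here is a rough sketch that spells out the implicit ASCII conversions explicitly, so it behaves the same on Python 2 and 3. The helper names are hypothetical and only for illustration; they are not part of any proposed API.

```python
# Emulates the implicit ASCII-based conversions that Python 2 performs.
# All helper names here are hypothetical, for illustration only.


def implicit_unicode_to_bytes(text):
    """Roughly what Python 2 does with the name in getattr(obj, u'...')."""
    return text.encode('ascii')


def implicit_bytes_to_unicode(data):
    """Roughly what Python 2 does with the str operand in 'x' + u''."""
    return data.decode('ascii')


def converts_cleanly(func, value):
    """Return True if the conversion succeeds, False on a Unicode error."""
    try:
        func(value)
        return True
    except UnicodeError:  # base class of UnicodeEncodeError/UnicodeDecodeError
        return False


ascii_name_ok = converts_cleanly(implicit_unicode_to_bytes, u'name')
non_ascii_name_ok = converts_cleanly(implicit_unicode_to_bytes, u'\u1234')
ascii_bytes_ok = converts_cleanly(implicit_bytes_to_unicode, b'x')
non_ascii_bytes_ok = converts_cleanly(implicit_bytes_to_unicode, b'\xff')
```

Under this reading, _AsciiUnicode would cover values like u'name' that survive the unicode->bytes direction, in the same way _AsciiStr covers values like 'x' that survive the bytes->unicode direction.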
> * It would be helpful to spell out exactly what is and isn't allowed
> when different core types (bytes, str, unicode, Text) meet in Python 2
> and in Python 3. Something like a table with a row and a column for
> each and the type of x+y (or "error") in each of the cells.
Agreed. I'll try to come up with specific rules for handling text/binary types (bytes, str, unicode, Text, _Ascii*) in Python 2 and 3. To me, the rules for dealing with _Ascii* look the most problematic at the moment, since it's unclear how these types should propagate through text-handling functions.
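As a starting point for such a table, a small probe like the one below could record what x + y actually does on whichever interpreter runs it. This is only a sketch: the sample values and the function names are my own choice, and it covers runtime values, not the proposed static types.

```python
# Probe the runtime result of x + y for a few text/binary sample values.
# Runs on Python 2.7 and Python 3; sample values are illustrative only.


def concat_result(x, y):
    """Return the type name of x + y, or the name of the exception raised."""
    try:
        return type(x + y).__name__
    except (TypeError, UnicodeError) as exc:
        return type(exc).__name__


SAMPLES = [
    ("b'x'", b'x'),
    ("b'\\xff'", b'\xff'),
    ("u'x'", u'x'),
    ("u'\\u1234'", u'\u1234'),
]


def concat_table():
    """Map each (left label, right label) pair to the outcome of left + right."""
    return {(lx, ly): concat_result(x, y)
            for lx, x in SAMPLES
            for ly, y in SAMPLES}


table = concat_table()
for (lx, ly), result in sorted(table.items()):
    print('%-10s + %-10s -> %s' % (lx, ly, result))
```

Running it once under each interpreter gives the two columns to compare: Python 3 rejects every bytes/text mix with TypeError, while Python 2 accepts b'x' + u'x' and raises UnicodeDecodeError only for non-ASCII bytes such as b'\xff'.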
> * I propose that Python 2+3 mode is just the intersection of what
> Python 2 and Python 3 mode allow. (In mypy, I don't think we'll
> implement this -- users will just have to run mypy twice with and
> without --py2. But for PyCharm it makes sense to be able to declare
> this. Yet I think it would be good not to have to spell out separately
> which rules it uses, defining it as the intersection of 2 and 3 is all
> we need.)
Yes, there is no need for a specific 2+3 mode. I was really referring to the intersection of the Python 2 and 3 APIs: a checker can warn when the user accesses a text/binary method that is not available in both.
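A minimal sketch of that intersection check, assuming the checker has a set of method names for each version. The sets below are tiny hand-picked samples rather than complete API listings, and the function name is hypothetical.

```python
# Hand-picked samples of text-type methods, NOT complete API listings.
PY2_UNICODE_METHODS = {'encode', 'decode', 'upper', 'startswith'}
PY3_STR_METHODS = {'encode', 'upper', 'startswith', 'casefold', 'isidentifier'}

# In 2+3 mode only names present in both versions are considered safe.
COMMON_TEXT_METHODS = PY2_UNICODE_METHODS & PY3_STR_METHODS


def available_in_both(name):
    """Return True if the method exists on the text type in Python 2 and 3."""
    return name in COMMON_TEXT_METHODS
```

With these samples, available_in_both('encode') holds, while 'decode' (Python 2 only) and 'casefold' (Python 3 only) would both be flagged.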
--
Andrey Vlasovskikh
Web: http://pirx.ru/