[Python-ideas] Type hints for text/binary data in Python 2+3 code
Andrey Vlasovskikh
andrey.vlasovskikh at gmail.com
Thu Mar 24 22:06:11 EDT 2016
> On 2016-03-25, at 4:07, Andrew Barnert <abarnert at yahoo.com> wrote:
>
> On Mar 24, 2016, at 17:00, Andrey Vlasovskikh <andrey.vlasovskikh at gmail.com> wrote:
>>
>> The only problematic conversions that may result in errors are `Text` to `str`
>> and vice versa in Python 2.
>
> So any time you use Text strings together with strings from sys.argv, sys.stdin/raw_input(), os.listdir(), ZipFile, csv.reader, etc., all of which are native str, they'll pass as valid in a 2+3 test, even though they're not actually valid in 2.x?
Yes, unfortunately these errors will go unnoticed. In return, this guarantees that there will be no false positive warnings related to text/binary types, and a model of text/binary types that matches the runtime semantics is easier for users to understand. Finding this kind of error would matter more if users were expected to port their code from Python 3 back to Python 2 more often than from 2 to 3.
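To make the failure mode concrete: Python 2 implicitly decodes a native str with the ASCII codec when it is combined with a unicode value. The sketch below emulates that coercion explicitly so that it also runs on Python 3. `py2_concat` is a hypothetical helper written for this illustration, not part of any library, and the file names are made up.

```python
def py2_concat(text, native):
    """Emulate Python 2's `unicode + str`.

    In Python 2 the native str operand is implicitly decoded with the
    default (ASCII) codec; here that step is spelled out explicitly.
    """
    return text + native.decode("ascii")


# ASCII-only native str: the implicit conversion succeeds.
ok = py2_concat(u"name: ", b"report.txt")

# Non-ASCII native str (for example a UTF-8 encoded file name, as
# os.listdir() may return on Python 2): the conversion fails at runtime,
# although a 2+3 type check would accept the expression.
try:
    py2_concat(u"name: ", u"r\u00e9sum\u00e9.txt".encode("utf-8"))
    failed = False
except UnicodeDecodeError:
    failed = True
```

Both calls have the same static types, which is why a checker that treats `Text` and `str` as compatible cannot tell them apart.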
Speaking of ways to actually find these errors, one idea discussed in the issue tracker of Mypy [1] was to have a separate _AsciiStr type for things that are certainly ASCII-compatible. However, treating all str values as non-ASCII by default would result in false positive warnings. We could instead have the opposite type, say, _NonAsciiStr (there should be a better name for it), which is not compatible with Text, for things we know for sure are non-ASCII:
* Non-ASCII str literals
* Functions like those you mentioned above
There will be false negatives in cases not covered by _NonAsciiStr, but at least there will be a way of documenting non-ASCII native str interfaces for users who care about this kind of Python 2 error. The downside is that _NonAsciiStr is harder to understand and apply correctly than str.
[1]: https://github.com/python/typing/issues/19
--
Andrey Vlasovskikh
Web: http://pirx.ru/