Re: [Python-ideas] Proposal to extend PEP 484 (gradual typing) to support Python 2.7

22 Jan 2016

On 22 January 2016 at 19:08, Guido van Rossum wrote:
...
On Fri, Jan 22, 2016 at 10:37 AM, Brett Cannon wrote:
...
On Thu, 21 Jan 2016 at 10:45 Guido van Rossum wrote:
...
On Thu, Jan 21, 2016 at 10:14 AM, Agustín Herranz Cecilia wrote:
[...]
Yes, this is not directly related to the choice of syntax for
annotations. It is intended to help in the process of porting Python 2 code
to Python 3, and it's outside the PEP's scope but related to the original
problem. What I have in mind is some type aliases so you could annotate a
version-specific type, to avoid ambiguity in code that is used on
different versions. In the end, what I originally tried to say is that it
would be good to have a conventional way to name these type aliases.
Yes, this is a useful thing to discuss.
Maybe we can standardize on the types defined by the 'six' package, which
is commonly used for 2-3 straddling code:
six.text_type (unicode in PY2, str in PY3)
six.binary_type (str in PY2, bytes in PY3)
Actually for the latter we might as well use bytes.
I agree that `bytes` should cover str/bytes in Python 2 and `bytes` in
Python 3.
OK, that's settled.
...
As for the textual type, I say either `text` or `unicode` since they are
both unambiguous between Python 2 and 3 and get the point across.
Then let's call it unicode. I suppose we can add this to typing.py. In PY2,
typing.unicode is just the built-in unicode. In PY3, it's the built-in str.
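For illustration only, here is a minimal sketch of what such version-dependent aliases could look like. The names text_type, binary_type and describe are assumptions made for this example (the first two are borrowed from the six names mentioned above); the thread only proposes the idea of a typing.unicode alias and does not define an implementation.

```python
import sys

# Hypothetical aliases for illustration; not the actual typing.py code.
if sys.version_info[0] >= 3:
    text_type = str        # PY3: the built-in str is the text type
else:
    text_type = unicode    # noqa: F821  (PY2: the built-in unicode)

# bytes exists in both versions: it is an alias of str in PY2.7
# and a distinct type in PY3.
binary_type = bytes


def describe(value):
    """Return 'text', 'binary' or 'other' for the given value."""
    if isinstance(value, text_type):
        return "text"
    if isinstance(value, binary_type):
        return "binary"
    return "other"
```

With aliases like these, straddling code can name "the text type" once and have it mean the right thing under either interpreter.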
This thread came to my attention just as I'd been thinking about a
related point.

For me, by far the worst Unicode-related porting issue I see is people
with a confused view of what type of data reading a file will give.
This is because open() returns a different type (byte stream or
character stream) depending on its arguments (specifically 'b' in the
mode) and it's frustratingly difficult to track this type across
function calls - especially in code originally written in a Python 2
environment where people *expect* to confuse bytes and strings in this
context. So, for example, I see a function read_one_byte which does
f.read(1), and works fine in real use when a data file (opened with
'b') is processed, but fails when sys.stdin is used (on Python 3, once
someone types a Unicode character).
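The failure mode can be reproduced without real files. This is a minimal sketch on Python 3, assuming in-memory streams from the io module stand in for a file opened with 'b' and for sys.stdin; read_one_byte is the hypothetical function described above.

```python
import io


def read_one_byte(f):
    # Intended to return a single byte, but the result type depends
    # entirely on what kind of stream the caller passes in.
    return f.read(1)


# Stand-in for a data file opened with mode 'rb'.
binary_stream = io.BytesIO(b"\xc3\xa9")   # UTF-8 encoding of U+00E9
# Stand-in for a text stream such as sys.stdin on Python 3.
text_stream = io.StringIO(u"\u00e9")

from_binary = read_one_byte(binary_stream)  # bytes: one byte
from_text = read_one_byte(text_stream)      # str: one character

# The "one byte" read from the text stream is really one character,
# which takes two bytes once encoded as UTF-8.
encoded_length = len(from_text.encode("utf-8"))
```

Nothing raises at the read itself; the mismatch only shows up later, when the caller treats the str result as if it were bytes.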

As far as I know, there's no way for type annotations to capture this
distinction - neither as they are at present in Python 3, nor as being
discussed here. But what I'm not sure of is whether it's something
that *could* be tracked by a type checker. Of course I'm also not sure
I'm right when I say you can't do it right now :-)
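One possible way to express the distinction is sketched below, assuming the generic I/O types that the typing module provides (typing.BinaryIO and typing.TextIO). The function names are hypothetical, and whether a given checker infers the right stream type from the mode argument of open() is a separate question that this sketch does not settle.

```python
import io
from typing import BinaryIO, TextIO


def read_one_byte(f: BinaryIO) -> bytes:
    # Annotated to accept only binary streams.
    return f.read(1)


def read_one_char(f: TextIO) -> str:
    # Annotated to accept only text streams.
    return f.read(1)


# Both calls match their annotations and also work at runtime.
one_byte = read_one_byte(io.BytesIO(b"ab"))
one_char = read_one_char(io.StringIO("ab"))
```

A checker that understands these types could then flag a call that passes a text stream to read_one_byte, provided it knows the stream's type at the call site.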

Is this something worth including in the discussion, or is it a
completely separate topic?
Paul

Paul Moore