Python 2 to 3 conversion - embrace the pain

INADA Naoki songofacandy at gmail.com
Mon Mar 16 15:41:11 EDT 2015


On Tue, Mar 17, 2015 at 2:47 AM, Terry Reedy <tjreedy at udel.edu> wrote:
> On 3/16/2015 5:13 AM, INADA Naoki wrote:
>
>> Another experience is porting Flask application in my company from
>> Python 2 to Python 3.
>> It has 26k lines of code and 7.6k lines of tests.
>>
>> Since we don't need to support both of PY2 and PY3, we used 2to3.
>> 2to3 changes 740 lines.
>
>
> That is less than 3% of the lines.  Were any changes incorrect?  How many
> lines *not* flagged by 2to3 needed change?

All of the changes were correct. Flask (and Werkzeug) handle most of the pain.
An application using Flask already uses unicode almost everywhere on Python 2.

The few changes 2to3 couldn't handle looked like this:

-    reader = DictReader(open(file_path, 'r'), delimiter='\t')
+    reader = DictReader(open(file_path, 'r', encoding='utf-8'), delimiter='\t')

Since the csv module in Python 2 doesn't support unicode, we had to parse
csv as byte strings there.
And since our server doesn't have a utf-8 locale, we have to specify the
encoding explicitly on PY3.
There were a few (fewer than 10, maybe) easy problems like this.
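As a rough sketch of the PY3 side of that change: the helper name read_tsv
and the sample file contents below are made up for illustration, and only
the open(..., encoding='utf-8') plus DictReader(..., delimiter='\t') part
comes from our actual diff. Passing newline='' follows the recommendation
in the csv module docs.

```python
import csv
import os
import tempfile


def read_tsv(file_path, encoding="utf-8"):
    """Read a tab-separated file into a list of dicts.

    The encoding is passed explicitly so the result does not depend on
    the server's locale. (read_tsv is a hypothetical helper name.)
    """
    # newline="" lets the csv module handle line endings itself.
    with open(file_path, "r", encoding=encoding, newline="") as f:
        return list(csv.DictReader(f, delimiter="\t"))


# Hypothetical sample data, written as UTF-8 to a temporary file.
with tempfile.NamedTemporaryFile(
    "w", encoding="utf-8", newline="", suffix=".tsv", delete=False
) as tmp:
    tmp.write("name\tgreeting\nnaoki\tこんにちは\n")
    sample_path = tmp.name

rows = read_tsv(sample_path)
os.remove(sample_path)
```

Without the encoding argument, open() falls back to the locale's preferred
encoding, which is why the same code can work on a developer machine and
fail on a server without a utf-8 locale.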


>
>> I had to replace google-api-client with
>> requests+oauthlib since
>> it had not supported PY3 yet.
>
>
> Other than those needed for this change, which 2to3 could not anticipate or
> handle?
>
>> After that, we encountered few trouble with untested code. But Porting
>> effort is surprisingly small.
>> We're happy now with Python 3.  We can write non-ascii string to log
>> without fear of UnicodeError.
>> We can use csv with unicode without hack.
>
>
> People who use ascii only or perhaps one encoding everywhere severely
> underestimate the benefit of unicode strings (and utf-8) everywhere.


I agree. We can easily lose log messages on Python 2, which makes
investigating bugs hard.

    >>> import logging
    >>> logging.error("%s %s", u'こんにちは', 'こんにちは')
    Traceback (most recent call last):
    ...
      File "/usr/local/Cellar/python/2.7.9/Frameworks/Python.framework/Versions/2.7/lib/python2.7/logging/__init__.py", line 335, in getMessage
        msg = msg % self.args
    UnicodeDecodeError: 'ascii' codec can't decode byte 0xe3 in position 0: ordinal not in range(128)
    Logged from file <stdin>, line 1


And logs that include unicode are hard to read.

    >>> logging.error("%s", [u'こんにちは'])
    ERROR:root:[u'\u3053\u3093\u306b\u3061\u306f']


Python 3 makes our development faster and easier.
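
For comparison, here is a minimal sketch of the same two logging calls on
Python 3. The logger name and the in-memory stream are assumptions made
only so that the result can be inspected without depending on the
terminal's encoding; they are not part of our application.

```python
import io
import logging

# Collect log output in memory instead of writing to stderr.
stream = io.StringIO()

logger = logging.getLogger("py3_unicode_demo")  # hypothetical logger name
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter("%(levelname)s:%(name)s:%(message)s"))
logger.addHandler(handler)
logger.propagate = False  # keep the demo output out of the root logger

# On Python 3 both arguments are str, so formatting cannot hit an
# implicit ascii decode the way the Python 2 example does.
logger.error("%s %s", "こんにちは", "こんにちは")

# repr() of a str inside a list keeps printable non-ASCII characters.
logger.error("%s", ["こんにちは"])

output = stream.getvalue()
```

Here output holds two lines: the first contains the greeting twice, and the
second shows the list as ['こんにちは'] rather than as \u escapes.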

Since experienced Python programmers know how to avoid the pitfalls of
Python 2, writing Python 2 is not a pain for them.
But when teaching Python to PHP programmers, teaching tons of pitfalls is a pain.
This is why I think new applications should start with Python 3.

>
>> Porting *modern* *application* code to *PY3 only* is easy, while
>> porting libraries on the edge of
>> bytes/unicode like google-api-client to PY2/3 is not easy.
>>
>> I think application developers should use *only* Python 3 from this year.
>> If we start moving, more library developers will be able to start
>> writing Python 3 only code from next year.
>
>
> --
> Terry Jan Reedy
>
> --
> https://mail.python.org/mailman/listinfo/python-list



--
INADA Naoki  <songofacandy at gmail.com>


