[Python-Dev] Migration from Python 2.7 and bytes formatting

Neil Schemenauer nas at arctrix.com
Fri Jan 17 21:09:45 CET 2014


I've refined this idea a little in my latest PEP 461 patch (issue
20284).  Continuing to use %s instead of introducing %b seems
better.  I've called the command-line option -2; it could also be
used to enable other similar porting aids.

I'd like to try porting code making use of the -2 feature to see how
helpful it is.  The behavior is partway between Python 2.x laziness
and Python 3.x strictness in terms of specifying encodings.

Python 2.x:

    - coerce byte strings to unicode strings to avoid making a
      decision about encoding

    - when writing a unicode string to a bytes stream without
      a specified encoding, encode with ASCII.  Blow up with an
      exception if a non-ASCII character is encountered, often far
      from where the real bug is.
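
To illustrate the second point, here is a rough sketch in Python 3 that
imitates the Python 2.x behavior.  The class Py2StyleByteStream and the
function build_greeting are made up for this example; they only mimic
the implicit ASCII encoding on write and are not real Python 2 code.

```python
import io


class Py2StyleByteStream:
    """Hypothetical stand-in for a Python 2 byte stream (assumption)."""

    def __init__(self):
        self._buf = io.BytesIO()

    def write(self, data):
        # Imitate Python 2: text is silently encoded with strict ASCII.
        if isinstance(data, str):
            data = data.encode("ascii")
        self._buf.write(data)

    def getvalue(self):
        return self._buf.getvalue()


def build_greeting(name):
    # The text is created here, with no decision made about encoding.
    return "Hello, " + name + "\n"


stream = Py2StyleByteStream()
stream.write(build_greeting("Neil"))  # pure ASCII, works

try:
    # The non-ASCII name was introduced in build_greeting, but the
    # failure only shows up here, at the point of the write.
    stream.write(build_greeting("Zo\xeb"))
    late_error = None
except UnicodeEncodeError as exc:
    late_error = exc
```

The exception is raised at the write, not where the non-ASCII text
entered the program, which is what makes such bugs hard to track down.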

Python 3.x:

    - refuse to accept unicode strings where bytes are expected,
      require explicit encoding to be performed
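
A small example of that strictness, using made-up variable names:
mixing bytes and str raises TypeError, and the code has to name an
encoding before the two can be combined.

```python
method = b"GET "
path = "/index.html"  # a str (unicode) value

try:
    # Python 3 refuses to mix bytes and str.
    request = method + path
    mixed_ok = True
except TypeError:
    mixed_ok = False
    # The encoding must be spelled out explicitly.
    request = method + path.encode("ascii")
```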

Python 3.x with -2 command-line option:

    - when objects are formatted into bytes, immediately
      encode them using strict ASCII encoding.
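
As a rough pure-Python approximation of that behavior (the patch itself
changes the interpreter; the helper name format_bytes_ascii is
hypothetical and only handles str arguments), assuming an interpreter
that supports % formatting on bytes, i.e. Python 3.5 or later:

```python
def format_bytes_ascii(fmt, *args):
    """Hypothetical helper approximating the -2 behavior (assumption).

    str arguments are encoded with strict ASCII right away, so a
    non-ASCII character fails at the formatting call itself.
    """
    converted = []
    for arg in args:
        if isinstance(arg, str):
            arg = arg.encode("ascii", "strict")
        converted.append(arg)
    # bytes % formatting requires Python 3.5+ (PEP 461).
    return fmt % tuple(converted)


header = format_bytes_ascii(b"Host: %s\r\n", "example.org")

try:
    format_bytes_ascii(b"Host: %s\r\n", "caf\xe9.example")
    failed_early = False
except UnicodeEncodeError:
    # The error points at the formatting call, not a later write.
    failed_early = True
```

Unlike the Python 2.x case, the failure happens where the text is
formatted into bytes, which should make the offending value easier to
find.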

No code would be considered fully ported to Python 3 unless it can
run without the -2 command line option.

  Neil


