[Python-Dev] PEP 414 - Unicode Literals for Python 3

Chris McDonough chrism at plope.com
Mon Feb 27 22:16:39 CET 2012


On Mon, 2012-02-27 at 21:03 +0000, Vinay Sajip wrote:
> Chris McDonough <chrism <at> plope.com> writes:
> 
> > I really don't know how long I'll need to do future development in the
> > subset language of Python 2 and Python 3 because I can't predict the
> > future.  It could be two years, it might be five.  Who knows.
> > 
> > But I do know that I'm going to be developing in the subset of Python
> > that currently runs on Python 2 >= 2.6 and Python 3 >= 3.2 for at least
> > a year.  And that will suck, because that language is a much less fun
> > language in which to develop than either Python 2 or Python 3.  Frankly,
> > it's a pretty bad language.
> 
> What exactly is it that makes it so bad? Since you're developing for >= 2.6,
> what stops you from using "from __future__ import unicode_literals" and 'xxx'
> for text and b'yyy' for bytes? Then you would be working in essentially Python
> 3.x, at least as far as string literals go. The conversion time will be very
> small compared to the year time-frame you're talking about.
> 
> > If we make this change now, it means a year from now I'll be able to
> > develop in a slightly less sucky subset language if I choose to drop
> > support for 3.2.  And people who don't try to support Python 3 at all
> > until then will never have to program in the suckiest subset like I will
> > have had to.
> 
> And if we don't make the change now and you change your code to use
> unicode_literals, convert u'xxx' -> 'xxx' and then change the places where you
> really meant to use bytes, that'll be a one-off change after which you will be
> working on a common codebase which works on 2.6+ and 3.0+, and as far as string
> literals are concerned you'll be working in the hopefully non-sucky 3.x syntax.
> 
> > Note that u'' literals are sort of the tip of the iceberg here;
> > supporting them will obviously not make development under the subset an
> > order of magnitude less sucky, just a tiny little bit less sucky.  There
> > are other extremely annoying things, like str(bytes) returning the repr
> > of a bytestring on Python 3.  That's almost as irritating as the absence
> > of u'' literals, but we have to evaluate one thing at a time.
> 
> Yes, but making a backward step like reintroducing u'' just to make things a
> tiny little bit less sucky doesn't seem to me to be worth it, because then >= 3.3 is
> different to 3.2 and earlier. Armin's suggestion of an install-time fixer is
> analogous to running 2to3 after every change, if you're trying to support 3.2
> and 3.3+ at the same time, isn't it? You can't just edit-and-test, which to me
> is the main benefit of a single codebase.

The downsides of a unicode_literals future import are spelled out in the
PEP:

http://www.python.org/dev/peps/pep-0414/#rationale-and-goals
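
For anyone following along, here is a rough sketch of the approach Vinay
describes, under my reading of the PEP's objection: once the future import
is on, an unprefixed literal can no longer mean "native str" on Python 2,
so some wrapper is needed wherever an API wants one. The native() helper
and the header_name variable are hypothetical, made up for illustration;
they are not from the PEP or from any library.

```python
from __future__ import unicode_literals  # has no effect on Python 3

import sys

PY2 = sys.version_info[0] == 2

text = 'xxx'   # text: unicode on Python 2.6+/2.7 due to the import, str on 3.x
data = b'yyy'  # bytes on Python 2.6+ and on 3.x


def native(s):
    """Hypothetical helper: return s as the interpreter's native str type.

    Assumes s is ASCII-only text. On Python 2 the literal is unicode
    (because of the future import), so it is encoded to a byte str;
    on Python 3 it is already a str and is returned unchanged.
    """
    if PY2:
        return s.encode('ascii')
    return s


# An API that insists on native strings needs the wrapper on every literal.
header_name = native('Content-Type')

# The other annoyance mentioned above: on Python 3, str() of a bytes object
# gives its repr ("b'yyy'"); on Python 2 it gives plain 'yyy'.
shown = str(data)
```

This is only meant to show where the extra wrapping creeps in; whether
that cost outweighs a one-off conversion is the point under debate.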

- C