[Python-Dev] readd u'' literal support in 3.3?

Fri Dec 9 08:38:05 CET 2011

On Fri, 2011-12-09 at 16:36 +1000, Nick Coghlan wrote:
> On Fri, Dec 9, 2011 at 4:01 PM, Chris McDonough <chrism at plope.com> wrote:
> > On the consumer side, folks who want to run 2.6/2.7/3.3-only codebases
> > will have the wherewithal to compile their own Python 3 (or use a PPA or
> > equivalent) until the distros catch up.
> >
> > So I'm not sure why 3.2 not having support for u'' should be a real
> > blocker for the change.
> 
> If this argument was valid, people wouldn't be so worried about
> maintaining 2.5 compatibility in their libraries. Consider if I tried
> to make this argument to justify everyone dropping 2.5 and earlier
> support today:
> 
> """On the consumer side, folks who want to run 2.6+ codebases on older
> Linux distros have the wherewithal to compile their own more recent
> Python 2 (or use a PPA or equivalent) until they can move to a more
> recent version of their distro."""

Fair point.

That said, personally, I have given up entirely on Python 2.4 and 2.5
support for newer versions of my OSS libraries.  I continue to backport
fixes and (some) features to older library versions so folks can run
those on systems that require older Pythons.  I gave up 2.5 support
fairly recently across everything new, and I gave up support for 2.4 a
year ago or more in new releases with the same intent.

In reality, there is only one major platform that requires 2.4: RHEL 5,
and folks who use it will just need to also use old versions of popular
libraries; trying to support it for all future feature work until it's
EOLed is not sane unless someone pays for it.  Python 2.5 has slightly
more compelling platforms (GAE and Jython), but GAE is moving to Python
2.7 and Jython is a bit moribund these days and is not really popular
enough that a critical mass of folks will clamor for new-and-shiny
releases that run on it.

The upshot is that most newly created code only needs to run on Python
2.6 and *some* version of Python 3.  And being able to eventually write
that code in a nonsucky subset of Python 2/3 is important to me, because
I'm going to be developing software in that subset for many years (way
past the timeframe we're talking about in which Python 3.2 will rule the
roost).
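
To make the "subset" concrete, here is a minimal sketch of the two ways to
spell a text literal in straddling code. The helper name u() and the
variable names are illustrative assumptions (the helper is modeled loosely
on what compatibility shims do), not code from any particular library:

```python
import sys

PY3 = sys.version_info[0] >= 3

if PY3:
    text_type = str

    def u(s):
        # On Python 3 a plain string literal is already text.
        return s
else:
    text_type = unicode  # Python 2 only; never evaluated on Python 3

    def u(s):
        # On Python 2 a plain literal is a byte string; decode it to unicode.
        return s.decode("unicode_escape")

# Runs on 2.6, 2.7 and every 3.x, but each text literal must be wrapped.
greeting_wrapped = u("caf\xe9")

# Runs on 2.6, 2.7 and 3.3+; a SyntaxError on 3.0 through 3.2.
greeting_literal = u"caf\xe9"
```

The wrapped form is what code has to do today to include 3.2; the literal
form is what restoring u'' in 3.3 would allow.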

> It's simply not true in the general case - people don't maintain 2.4+
> compatibility for fun, they do it because RHEL5 (and CentOS 5, etc)
> are still reasonably common and ship with 2.4 as the system Python. As
> soon as you switch away from the system provided Python, you're
> switching away from the vendor's entire pre-packaged Python *stack*,
> not just the interpreter itself. You then have to install (and
> generally build) *everything* for yourself. While that is certainly
> possible these days (and a lot simpler than it used to be), it's still
> not trivial [1].
> 
> Since 3.2 is already quite usable for applications that aren't
> fighting with the "native strings" problem (which seems to be the
> common thread running through the complaints I've heard from web
> framework authors), and with it being included in at least the next
> Ubuntu LTS, current versions of Fedora, Arch, etc, it's going to be
> around for a long time. Ignoring 3.1 is a reasonable option. Ignoring
> 3.2 entirely is unlikely to be viable for anyone that is interested in
> supporting 3.x within the next couple of years - the 3.3 release is at
> least 9 months away, and it's also going to take a while for it to
> make its way into distros after the final release gets published on
> python.org.
> 
> Hence my suggestion: perhaps the problem is the fact that PEP 3333/WSGI
> 1.0.1 introduced the "native string" concept as a minimalist hack to
> try to get a usable gateway interface in Python 3, and that just
> doesn't work in practice when attempting to straddle 2.x and 3.x
> (because the values WSGI is dealing with aren't really text, they're
> bytes, only *some* of which represent text). Perhaps a PEP 444 based
> model would be less painful and more coherent in the long run?

Possibly.  I was the original author of PEP 444, with help from Armin
(although it has since been taken up by Alice, and I do not support the
updates it has received since then).

A bytes-oriented WSGI-like protocol was always the saner option.  The
native string idea optimized in exactly the wrong place, which was to
make it easy to write WSGI middleware, where you're required to do lots
of textlike manipulation of header values.  The idea of using bytes in
places where PEP 3333 now mandates native strings was rejected because
people were (somewhat justifiably) horrified at what they had to do in
order to attempt to treat bytes like strings in this context on Python 3 at
the time.  It has gotten better, but maybe still not better enough to
appease the folks who blocked the idea originally.
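
For anyone who hasn't had to deal with it, a rough sketch of what a
"native string" means on Python 3 may help: the server hands the
application a str whose code points are really bytes decoded as
ISO-8859-1, and the application has to round-trip it to get actual text.
The helper names below are hypothetical and the UTF-8 default is an
assumption about the client; neither comes from PEP 3333 itself:

```python
def native_to_text(value, encoding="utf-8"):
    # Undo the server's latin-1 decoding to get the wire bytes back,
    # then decode them with the encoding the application assumes.
    return value.encode("latin-1").decode(encoding)


def text_to_native(value, encoding="utf-8"):
    # Encode real text to wire bytes, then present those bytes as a
    # latin-1 "native string" the way a Python 3 WSGI server would.
    return value.encode(encoding).decode("latin-1")


# Bytes as they arrive on the wire for a request path containing U+00E9.
raw_path = "/caf\u00e9".encode("utf-8")

# What a Python 3 server would place in environ["PATH_INFO"].
environ_path = raw_path.decode("latin-1")

# The application must transcode before it has usable text.
real_path = native_to_text(environ_path)
```

The value in environ looks like text but is not the text the client sent
until it has been transcoded, which is the mismatch being described here.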

But all of that is just arguing with the umpire at this point.
Promoting and getting consensus about a different protocol will hurt a
lot.  PEP 3333 was born of months of intense periods of arguing and
compromise.  It is the way it is now because everyone was too exhausted
to argue about it any more.  I don't think that has changed much since
it was accepted, and asking folks to go back to that particular drawing
board is unlikely to have promising results.  Folks have already spent
many hours and lots of money on implementations of the current PEP.
They may hunt us down and murder us one by one. ;-)  PEP 3333, to its
credit, is also remarkably backwards compatible with PEP 333, requiring
very little change in existing Python 2 WSGI implementations, which
helps Python 2 folks a lot.

Given an effective choice between enabling six lines of code in Python
3.3 to support u'' and months of political wrangling and code rewriting,
I'll choose the former any day.  If we were talking about a change to
Python that actually required nontrivial effort, had some sort of
nominal consequence, or had some sort of non-theoretical downside, I'd
be a lot less sanguine about it.  But this is just a no-brainer in the
long term, AFAICT.

- C