[Python-ideas] Code version evolver

Stephen J. Turnbull turnbull.stephen.fw at u.tsukuba.ac.jp
Sun Mar 17 13:30:29 EDT 2019


francismb writes:
 > On 3/15/19 4:54 AM, Stephen J. Turnbull wrote:
 > > What 2to3 does is to handle a lot of automatic conversions, such as
 > > flipping the identifiers from str to bytes and unicode to str.  It was
 > > necessary to have some such tool because of the very large amount of
 > > such menial work needed to change a 2 code base to a 3 code base.  But
 > > even so, there were things that 2to3 couldn't do, and it often exposed
 > > bugs or very poor practice (decode applied to unicode objects, encode
 > > applied to bytes) that had to be reworked by the developer anyway.

 > Very interesting from the 2/3 transition experience point of view. But
 > that's not still the past, IMHO that will be after 2020,... around 2025 :-)

Yeah, I did a lightning talk on that about 3 years ago (whenever the
2020 Olympics was awarded to Tokyo, which was pretty much simultaneous
with the start of the EOL-for-Python-2 clock -- the basic fantasy was
that "Python 2 is the common official business-oriented language of
the Tokyo Olympics and Paralympics", and the punch line was "no gold
for programmers, just job security").

But so what?  My point is that 2to3 development itself is the past.  I
don't think anybody's working on it at all now.  The question you
asked is "we have 2to3, why not 3.Xto3.Y?" and my answer is "here's
why 2to3 was worth the effort, 3.X upgrades are quite different and
it's not worth it".

 > Could one also say that under the line that it *improved* the code? 
 > (by exposing bugs, bad practices) could be a first step to just
 > *flag* those behaviors/changes ?

Probably not.  So many lines of code need to be changed to go from 2
to 3 that most likely the first release after conversion is a pile of
dungbeetles.  Remember, some Python 2 code uses str as more or less
opaque bytes, other code uses it as "I don't need no stinkin' Unicode"
text (works fine for monolingual environments with 8-bit encodings,
after all).  So 2to3 doesn't even do a great job for 'str' vs 'unicode'
vs 'bytes'.  No automatic conversion could do more than a 50% job for
most medium-size projects, and every line of code changed has some
probability of introducing a bug.  If there were a lot of bugs to
start with, that probability goes up -- and a lot of lines change,
implying a lot of *new* bugs.  It's hard for a syntax-based tool to
find enough old bugs to keep up with the proliferation of new ones.
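
To make the str ambiguity concrete, here is a minimal sketch in
Python 3.  The names checksum and greeting are made up for
illustration and don't come from any real code base.  In Python 2
both call sites would have passed the same plain "abc" literal, so a
tool that looks only at syntax has nothing to go on when deciding
which one needs the b prefix.

```python
def checksum(payload: bytes) -> int:
    """Hypothetical code that treated Python 2 str as opaque bytes."""
    # Iterating over bytes yields ints in Python 3, so sum() works.
    return sum(payload) % 256


def greeting(name: str) -> str:
    """Hypothetical code that treated Python 2 str as text."""
    return "Hello, " + name.upper()


# The correct conversion differs per call site, although the Python 2
# source looked identical ("abc" in both places).
checksum_result = checksum(b"abc")
greeting_result = greeting("abc")

# If a converter guesses "text" for the bytes-oriented call, the code
# breaks at run time instead of at conversion time.
try:
    checksum("abc")
    wrong_guess_failed = False
except TypeError:
    wrong_guess_failed = True
```

Only somebody who knows what the data means can pick the right type
for each call, which is why the developer ends up reviewing those
lines by hand anyway.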

You really should have given up on this by now.  It's not that it's a
bad idea: 2to3 wasn't just a good idea, it was a necessary idea in its
context.  But the analogy to within-3 upgrades doesn't hold, and it's
not hard to see why it doesn't once you have the basic facts
(conservative policy toward backwards compatibility, even across major
versions).  I could be wrong, but I don't think there's much for you
to learn by prolonging the thread.  Unless you actually code an
upgrade tool yourself -- then you'll learn a *ton*.  That's not my
idea of fun, though. :-)

Steve


