
Hi, I would like to discuss the idea of a code (minor) version evolver/re-writer (or at least a change indicator). Let's say one wants to add a feature in the next version and some small grammar change is needed; then the script first upgrades/evolves the current code, and then the new version can be installed/used. Something like:
How hard is that? Or what is possible and what is not? Where should it happen? In the installer? Thanks in advance! --francis

On Mon, 11 Mar 2019 at 20:39, francismb <francismb@email.de> wrote:
That sounds very similar to 2to3, which seemed like a good approach to the Python 2 to Python 3 transition, but fell into disuse because people who have to support multiple versions of Python in their code found it *far* easier to do so with a single codebase that worked with both versions, rather than needing to use a translator. Paul

Hi Paul, On 3/12/19 12:21 AM, Paul Moore wrote:
Trying to keep a single code base for 2/3 seems like a good idea (maybe the developer just cannot change to 3 quickly because of how big the step is), but that also limits how far you can go in using new features. Once you're only on the 3 series, couldn't such a 2to3 concept also help to speed things up? (due to the 'backwards-compatibility issue') Thanks, --francis

francismb writes:
This doesn't work very well: you can't use 3 at all until you can use 3 everywhere. So the evolutionary path is

0.  Python 2-only code base
1.  Part Python 2-only, part Python 2/3 code base (no new features anywhere, since everything has to run in Python 2 -- may require new test harnesses for component testing under Python 3, and system test has to wait for step 2)
2.  Complete Python 2/3 code base
3a. Users use their preferred Python and developers have 2/3 4ever! (All limitations apply in this case. :-( )
3b. Project moves to Python 3-only.

So what most applications did is branch 2 vs. 3, do almost all new development on 3 (bugfixing on both 2 and 3 of course, and maybe occasionally backporting a few new features to 2), and eventually (often as soon as there's a complete implementation for Python 3!) stop supporting Python 2. Only when there was strong demand for Step 3a (typically for popular libraries) did it make sense to spend effort satisfying the constraints of a 2/3 code base.
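To make the "2/3 code base" of steps 1 to 3a concrete, here is a minimal sketch of the kind of shim such code bases typically carried. The names PY2, text_type and to_text are illustrative assumptions, not taken from any project in this thread; libraries such as six packaged similar helpers.

```python
import sys

# Assumption for illustration: one module that runs unchanged on 2 and 3.
PY2 = sys.version_info[0] == 2

# The text type is `unicode` on Python 2 and `str` on Python 3.
# The name `unicode` is only looked up when PY2 is true.
text_type = unicode if PY2 else str  # noqa: F821


def to_text(value, encoding="utf-8"):
    """Return `value` as text, decoding bytes if necessary."""
    if isinstance(value, bytes):
        return value.decode(encoding)
    return text_type(value)
```

Everything in such a module has to stay inside the common subset of both versions, which is the "no new features anywhere" limitation noted in step 1.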
Once you're only on the 3 series, couldn't such a 2to3 concept also help to speed things up? (due to the 'backwards-compatibility issue')
Not really. For example, addition of syntax like "async" and "yield" fundamentally changes the meaning of "def", in ways that *could not* be fully emulated in earlier Pythons. The semantics simply were impossible to produce -- that's why syntax extensions were necessary.

What 2to3 does is to handle a lot of automatic conversions, such as flipping the identifiers from str to bytes and unicode to str. It was necessary to have some such tool because of the very large amount of such menial work needed to change a 2 code base to a 3 code base. But even so, there were things that 2to3 couldn't do, and it often exposed bugs or very poor practice (decode applied to unicode objects, encode applied to bytes) that had to be reworked by the developer anyway.

The thing about "within 3" upgrades is that that kind of project-wide annoyance is going to be minimal, because the language is mostly growing in power, not changing the semantics of existing syntax. Such changes are very rare, and considered extremely carefully for implications for existing code. In a very few cases it's possible to warn about dangerous use of obsolete syntax whose meaning has changed, but that's very rare too.

In some cases *pure additions* to the core will be available via "from __future__ import A", which covers many of the cases of "I wish I could use feature A in version X.Y". But this kind of thing is constrained by core developer time, and developing a 3.x to 3.y utility is (IMO, somebody else is welcome to prove me wrong! :-) way past the point of zero marginal returns to developer effort.

It's an interesting idea, but I think practically it won't have the benefits you hope for, at least not enough to persuade core developers to work on it.

Steve

--
Associate Professor              Division of Policy and Planning Science
http://turnbull.sk.tsukuba.ac.jp/     Faculty of Systems and Information
Email: turnbull@sk.tsukuba.ac.jp                   University of Tsukuba
Tel: 029-853-5175                 Tennodai 1-1-1, Tsukuba 305-8573 JAPAN

On Fri, Mar 15, 2019 at 08:10:58PM +0100, francismb wrote:
No, it is not a backwards compatible change. Any code using async as a name will fail.

    py> sys.version
    '3.8.0a2+ (heads/pr_12089:5fcd3b8, Mar 11 2019, 12:39:33) \n[GCC 4.1.2 20080704 (Red Hat 4.1.2-55)]'
    py> async = 1
      File "<stdin>", line 1
        async = 1
              ^
    SyntaxError: invalid syntax

-- Steven

On Sat, Mar 16, 2019 at 12:28 PM Steven D'Aprano <steve@pearwood.info> wrote:
Though that particular case is a little complicated. Python 3.4.4 (default, Apr 17 2016, 16:02:33)
Python 3.5.3 (default, Sep 27 2018, 17:25:39) and Python 3.6.5 (default, Apr 1 2018, 05:46:30)
Python 3.7.0a4+ (heads/master:95e4d58913, Jan 27 2018, 06:21:05)
So at what point do you call it a backward-incompatible change? And if you have some sort of automated translation tool to "fix" this, when should it rename something that was called "async"? ChrisA
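For background on why the case is complicated: async and await were introduced in 3.5 as context-dependent names (PEP 492) and only became reserved keywords in 3.7, so the same line can be valid or invalid depending on the interpreter. A change *indicator* for this one case can be sketched with the standard tokenize module, which does not need the source to parse. This is a rough heuristic under stated assumptions, not a rewriter: the function name and the "followed by def/for/with" rule are inventions for this example, and await is not handled.

```python
import io
import tokenize

# Tokens that follow the keyword form of `async` (async def/for/with).
KEYWORD_FOLLOWERS = {"def", "for", "with"}


def find_async_identifiers(source):
    """Return (line, column) of each `async` that looks like a plain name."""
    readline = io.StringIO(source).readline
    tokens = [
        tok for tok in tokenize.generate_tokens(readline)
        if tok.type not in (tokenize.COMMENT, tokenize.NL)
    ]
    hits = []
    for tok, following in zip(tokens, tokens[1:]):
        # String literals keep their quotes in tok.string, so 'async'
        # inside a string never matches here.
        if tok.string == "async" and following.string not in KEYWORD_FOLLOWERS:
            hits.append(tok.start)
    return hits
```

Flagging is the easy half. Deciding what to rename the identifier to, and in which target versions, is the open question raised above.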

On 3/15/19 4:54 AM, Stephen J. Turnbull wrote: That's not the past yet; IMHO that will be after 2020, ... around 2025 :-) Could one also say that, at the bottom line, it *improved* the code (by exposing bugs and bad practices)? Could a first step be to just *flag* those behaviors/changes? Regards, --francis

francismb writes:
Very interesting from the 2/3 transition experience point of view. But that's not the past yet; IMHO that will be after 2020, ... around 2025 :-)
Yeah, I did a lightning talk on that about 3 years ago (whenever the 2020 Olympics was awarded to Tokyo, which was pretty much simultaneous with the start of the EOL-for-Python-2 clock -- the basic fantasy was that "Python 2 is the common official business-oriented language of the Tokyo Olympics and Paralympics", and the punch line was "no gold for programmers, just job security"). But so what? My point is that 2to3 development itself is the past. I don't think anybody's working on it at all now. The question you asked is "we have 2to3, why not 3.Xto3.Y?" and my answer is "here's why 2to3 was worth the effort, 3.X upgrades are quite different and it's not worth it".
Probably not. So many lines of code need to be changed to go from 2 to 3 that most likely the first release after conversion is a pile of dungbeetles. Remember, some Python 2 code uses str as more or less opaque bytes, other code uses it as "I don't need no stinkin' Unicode" text (works fine for monolingual environments with 8-bit encodings, after all). So it doesn't even do a great job for 'str' vs 'unicode' vs 'bytes'.

No automatic conversion could do more than a 50% job for most medium-size projects, and every line of code changed has some probability of introducing a bug. If there were a lot of bugs to start with, that probability goes up -- and a lot of lines change, implying a lot of *new* bugs. It's hard for a syntax-based tool to find enough old bugs to keep up with the proliferation of new ones.

You really should have given up on this by now. It's not that it's a bad idea: 2to3 wasn't just a good idea, it was a necessary idea in its context. But the analogy for within-3 upgrades doesn't hold, and it's not hard to see why it doesn't once you have the basic facts (conservative policy toward backwards compatibility, even across major versions). I could be wrong, but I don't think there's much for you to learn by prolonging the thread. Unless you actually code an upgrade tool yourself -- then you'll learn a *ton*. That's not my idea of fun, though. :-)

Steve

On 3/15/19 2:34 PM, francismb wrote:
Translating existing code is a small problem when the language changes backwards-incompatibly. The larger problem is unlearning what I used to know and learning a new language. If that happens enough, I'll stop using a language. One of Python's strengths is that it evolves very slowly, and knowledge I work hard to accumulate now remains useful for a long time.

On Mon, Mar 11, 2019 at 09:38:21PM +0100, francismb wrote:
What you want this "version evolver" to do might be clear to you, but it certainly isn't clear to me. I don't know what you mean by evolving the current code, but I'm guessing you don't mean genetic programming. https://en.wikipedia.org/wiki/Genetic_programming I don't know who you expect is using this: the Python core developers responsible for adding new language features and changing the grammar, or Python programmers. I don't know what part of the current code (current code of *what*?) is supposed to be upgraded or evolved, or what you mean by that. Do you mean using this to add new grammatical features to the interpreter? Do you mean something like 2to3? Something which transforms source code written in Python? https://docs.python.org/2/library/2to3.html
How hard is it to get the behaviour shown? Easy!

    def python_next(version, ignoreme):
        x = float(version)
        return "%.1f" % (x + 0.1)

    def is_python_code(target, version):
        return target == version

How hard is it to get the behaviour not shown? I have no idea, since I can't guess what these functions do that you don't show. If you want these functions to do more than the behaviour you show, don't expect us to guess what they do.
or what is possible and what not? where should it happen? on the installer?
Absolutely no idea. -- Steven

Hi Steven, On 3/12/19 12:25 AM, Steven D'Aprano wrote: parts that moves source code from the current version to the next if a backwards incompatible grammar change is needed. Python programmers may use the helpers to upgrade to the next version.
Yes, a source transformer, but one applied to code for some 3.x version to move it to the next, 3.x+1, and so on... (instead of '2to3', a kind of 'nowtonext', aka 'python_next'). Couldn't that relax the tension around making 'backward compatibility changes' a bit? Thanks, --francis

On Sat, Mar 16, 2019 at 5:43 AM francismb <francismb@email.de> wrote:
People who care about backward compatibility will usually have some definition of what they support, such as "this app will run on any Python version shipped by a currently-supported Debian release" (which at the moment means supporting Python 3.4, shipped by Debian Jessie), or "we support back as far as isn't too much of a pain" (which usually means committing to support everything starting from the version that introduced some crucial feature). Either way, there's not usually a "forever", but potentially quite a few versions' worth of support. The same is true of books that discuss the language, blog posts giving tips and tricks, Stack Overflow answers, and everything else that incorporates code that people might want to copy and paste. What version of Python do you need? What's the oldest that it still works on, and what's the newest before something breaks it? Backward-incompatible changes make that EXTREMELY hard. Backward-compatible changes make it only a little bit harder, as they set a minimum but not a maximum. You want to see how bad it can be? Go try to find out how to do something slightly unusual with React.js. Stack Overflow answers sometimes have three, four, or even more different code blocks, saying "this if you're on this version, that for some other version". ChrisA

On Sat, Mar 16, 2019 at 7:35 AM francismb <francismb@email.de> wrote:
Python 3.5 introduced the modulo operator for bytes objects. How are you going to write a function that determines whether or not a piece of code depends on this? And, are you going to run this function on every single code snippet before you try it? I don't think this is possible, AND it's most definitely not a substitute for backward compatibility. ChrisA
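To illustrate why this is hard, here is a sketch of a purely syntactic detector built on the standard ast module. It can only decide the case where the left operand of % is a literal; for anything else the answer depends on run-time types, which is where type inference would have to come in. The function name and the three labels are made up for this example, and it assumes Python 3.8 or later, where literals are parsed as ast.Constant.

```python
import ast


def classify_mod_operations(source):
    """Label every `%` operation by what its left operand is known to be."""
    results = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.BinOp) and isinstance(node.op, ast.Mod):
            left = node.left
            if isinstance(left, ast.Constant) and isinstance(left.value, bytes):
                kind = "bytes"        # needs Python 3.5+ at run time
            elif isinstance(left, ast.Constant):
                kind = "not bytes"    # str formatting or plain arithmetic
            else:
                kind = "unknown"      # type only known at run time
            results.append((node.lineno, kind))
    return sorted(results)
```

In real code most % operations have a name or an expression on the left, so they would land in the "unknown" bucket.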

On Mon, Mar 18, 2019 at 01:13:29AM +1100, Chris Angelico wrote: [...]
I don't understand whether your question is asking if Francis *personally* can do this, or if it is possible in principle. If the latter, then inferring the type of expressions is precisely the sort of thing that mypy (and others) do. -- Steven

On Mon, Mar 18, 2019 at 9:34 AM Steven D'Aprano <steve@pearwood.info> wrote:
Kinda somewhere between. Francis keeps saying "oh, just make a source code rewriter", and I'm trying to point out that (1) that is NOT an easy thing to do - sure, there are easy cases, but there are also some extremely hard ones; and (2) even if it could magically be made to work, it would still have (and cause) problems. ChrisA

On Thu, Mar 14, 2019 at 09:33:21PM +0100, francismb wrote: [...]
Perhaps, but probably not. The core-developers are already overworked, and don't have time to add all the features we want. Making them responsible for writing this source code transformer for every backwards incompatible change will increase the amount of work they do, not decrease it, and probably make backwards-incompatible changes even less popular. For example: version 3.8 will include a backwards incompatible change made to the statistics.mode function. Currently, mode() raises an exception if the data contains more than one "most frequent" value. Starting from 3.8, it will return the first such value found. If we had to write some sort of source code translator to deal with this change, I doubt that we could automate this. And if we could, writing that translator would probably be *much* more work than making the change itself. Besides, I think it was Paul who pointed this out, in practice we found that 2to3 wasn't as useful as people expected. It turns out that for most people, writing version-independent code that supports the older and newer versions of Python is usually simpler than keeping two versions and using a translator to move from one to the other. But if you feel that this feature may be useful, I encourage you to experiment with writing your own version and putting it on PyPI for others to use. If it is successful, then we could some day bring it into the standard library. -- Steven
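The statistics.mode example also shows what the "version-independent code" alternative can look like in practice: instead of translating call sites, a project can pin down the behaviour it wants in a small helper of its own. The sketch below is an illustration only; first_mode and unique_mode are hypothetical names, ValueError stands in for the library's own exception, and the tie-breaking relies on dicts preserving insertion order (guaranteed from Python 3.7).

```python
def _counts(data):
    """Count occurrences, keeping first-seen order of the values."""
    counts = {}
    for item in data:
        counts[item] = counts.get(item, 0) + 1
    if not counts:
        raise ValueError("no mode for empty data")
    return counts


def first_mode(data):
    """Return the first most-frequent value (the newer behaviour)."""
    counts = _counts(data)
    # max() returns the first maximal key it encounters.
    return max(counts, key=counts.get)


def unique_mode(data):
    """Return the mode, or raise if it is not unique (the older behaviour)."""
    counts = _counts(data)
    best = max(counts.values())
    winners = [value for value, n in counts.items() if n == best]
    if len(winners) > 1:
        raise ValueError("no unique mode; found %d equally common values" % len(winners))
    return winners[0]
```

Which of the two a given call site actually wants is a question about the caller's intent, which is why it is hard to see how a translator could choose automatically.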

participants (8)
- Chris Angelico
- Dan Sommers
- francismb
- Greg Ewing
- Paul Moore
- Rémi Lapeyre
- Stephen J. Turnbull
- Steven D'Aprano