Request for CPython 3.5.3 release
Long story short, I've discovered that asyncio is broken in 3.5.2. Specifically, there is a callback race in `loop.sock_connect` which can make subsequent `loop.sock_sendall` calls hang forever. This is very tricky and hard to detect and debug; I had to spend a few hours investigating what was going on with a failing unittest in uvloop (an asyncio-compatible event loop). I can only imagine how hard it would be to understand what's going on in a larger codebase.

For those who are interested, here's a PR for the asyncio repo: https://github.com/python/asyncio/pull/366 It explains the bug in detail and proposes a patch to fix the problem.

Larry and the release team: would it be possible to make an "emergency" 3.5.3 release?

Going forward, we need to increase the number of functional tests for asyncio, as most of the current tests use mocks. I'm going to port all functional tests from uvloop to asyncio as a start.

Yury
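P.S. For folks who don't use these APIs directly: the affected pattern is just the usual low-level socket call sequence, roughly like this (a minimal sketch with a placeholder host/port, not the actual failing test):

    import asyncio
    import socket

    async def echo_once(loop, host, port):
        sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        sock.setblocking(False)
        # The callback race in sock_connect can leave the socket in a state
        # where the following sock_sendall never completes.
        await loop.sock_connect(sock, (host, port))
        await loop.sock_sendall(sock, b'ping')
        data = await loop.sock_recv(sock, 1024)
        sock.close()
        return data

    loop = asyncio.get_event_loop()
    print(loop.run_until_complete(echo_once(loop, '127.0.0.1', 8888)))

Nothing exotic - which is exactly why the hang is so surprising when you hit it.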
On 06/28/2016 02:05 PM, Yury Selivanov wrote:
Long story short, I've discovered that asyncio is broken in 3.5.2. Specifically, there is a callback race in `loop.sock_connect` which can make subsequent `loop.sock_sendall` calls hang forever. This is very tricky and hard to detect and debug; I had to spend a few hours investigating what was going on with a failing unittest in uvloop (an asyncio-compatible event loop). I can only imagine how hard it would be to understand what's going on in a larger codebase.
For those who are interested, here's a PR for the asyncio repo: https://github.com/python/asyncio/pull/366 It explains the bug in detail and proposes a patch to fix the problem.
Larry and the release team: would it be possible to make an "emergency" 3.5.3 release?
I've looped in the rest of the 3.5 release team. By the way, I don't know why you Cc'd Nick and Brett. While they're fine fellows, they aren't on the release team, and they aren't involved in these sorts of decisions. //arry/
On Jun 29, 2016, at 12:42 AM, Larry Hastings <larry@hastings.org> wrote:
By the way, I don't know why you Cc'd Nick and Brett. While they're fine fellows, they aren't on the release team, and they aren't involved in these sorts of decisions.
We're all involved in those sorts of decisions.

Raymond
On 06/28/2016 04:23 PM, Raymond Hettinger wrote:
On Jun 29, 2016, at 12:42 AM, Larry Hastings <larry@hastings.org> wrote:
By the way, I don't know why you Cc'd Nick and Brett. While they're fine fellows, they aren't on the release team, and they aren't involved in these sorts of decisions. We're all involved in those sorts of decisions.
Raymond
Perhaps, but that would make the Cc: list intractably long. //arry/
On 06/28/2016 02:51 PM, Larry Hastings wrote:
On 06/28/2016 02:05 PM, Yury Selivanov wrote:
Larry and the release team: would it be possible to make an "emergency" 3.5.3 release?
I'd like to hear from the other asyncio reviewers: is this bug bad enough to merit such an "emergency" release?
Thanks,
//arry/
There has been a distinct lack of "dear god yes Larry" emails so far. This absence suggests that, no, it is not a bad enough bug to merit such a release. If we stick to our usual schedule, I expect 3.5.3 to ship December-ish. //arry/
Hi everybody,

I fully understand that AsyncIO is a drop in the ocean of CPython, and that you're working to prepare the entire 3.5.3 release for December, which isn't ready yet. However, could you create a 3.5.2.1 release with only this AsyncIO fix? PEP 440 doesn't seem to forbid that: the examples in the PEP mostly use three digits, but I did find one example with four digits: https://www.python.org/dev/peps/pep-0440/#version-specifiers

If 3.5.2.1 or 3.5.3 are impossible to release before December, what are the alternative solutions for AsyncIO users?
1. Use 3.5.1 and hope that Linux distributions won't use 3.5.2?
2. Patch the asyncio source code by hand?
3. Remove the asyncio folder in CPython, and install asyncio from the GitHub repository?
4. Anything else?

To be honest, I'm migrating an AsyncIO application with more than 10,000 lines of code from 3.4.3 to 3.5.1, so I'm really interested to know whether it's better to keep 3.4.3 for now, or whether the 3.5 branch is stable enough.

Have a nice week-end.
--
Ludovic Gasc (GMLudo)
http://www.gmludo.eu/

2016-06-30 9:41 GMT+02:00 Larry Hastings <larry@hastings.org>:
On 06/28/2016 02:51 PM, Larry Hastings wrote:
On 06/28/2016 02:05 PM, Yury Selivanov wrote:
Larry and the release team: would it be possible to make an "emergency" 3.5.3 release?
I'd like to hear from the other asyncio reviewers: is this bug bad enough to merit such an "emergency" release?
Thanks,
*/arry*
There has been a distinct lack of "dear god yes Larry" emails so far. This absence suggests that, no, it is not a bad enough bug to merit such a release.
If we stay to our usual schedule, I expect 3.5.3 to ship December-ish.
*/arry*
On 2 July 2016 at 16:17, Ludovic Gasc <gmludo@gmail.com> wrote:
Hi everybody,
I fully understand that AsyncIO is a drop in the ocean of CPython, and that you're working to prepare the entire 3.5.3 release for December, which isn't ready yet. However, could you create a 3.5.2.1 release with only this AsyncIO fix?
That would be more work than just doing a 3.5.3 release, though - the problem isn't with the version number bump, it's with asking the release team to do additional work without clearly explaining the rationale for the request (more on that below). While some parts of the release process are automated, there's still a lot of steps to run through by a number of different people: https://www.python.org/dev/peps/pep-0101/.

The first key question to answer in this kind of situation is: "Is there code that will run correctly on 3.5.1 that will now fail on 3.5.2?" (i.e. it's a regression introduced by the asyncio and coroutine changes in the point release rather than something that was already broken in 3.5.0 and 3.5.1).

If the answer is "No", then it doesn't inhibit the 3.5.2 rollout in any way, and folks can wait until 3.5.3 for the fix.

However, if the answer is "Yes, it's a new regression in 3.5.2" (as in this case), then the next question becomes "Is there an agreed resolution for the regression?"

The answer to that is currently "No" - Yury's PR against the asyncio repo is still being discussed.

Once the answer to that question is "Yes", *then* the question of releasing a high priority fix in a Python 3.5.3 release can be properly considered by answering the question "Of the folks using asyncio, what proportion of them are likely to encounter problems in upgrading to Python 3.5.2, and is there a workaround they can apply or alternate approach they can use to avoid the problem?".

At the moment, Yury's explanation of the fix in the PR is (understandably) addressed at getting the problem resolved within the context of asyncio, and hence just describes the particular APIs affected, and the details of the incorrect behaviour. While that's an important step in the process, it doesn't provide a clear assessment of the *consequences* of the bug aimed at folks that aren't themselves deeply immersed in using asyncio, so we can't tell if the problem is "Some idiomatic code frequently recommended in user facing examples and used in third party asyncio based libraries may hang client processes" (which would weigh in favour of an early 3.5.3 release before people start encountering the regression in practice) or "Some low level API's not recommended for general use may hang if used in a particular non-idiomatic combination only likely to be encountered by event loop implementors" (which would suggest it may be OK to stick with the normal maintenance release cadence).
If 3.5.2.1 or 3.5.3 are impossible to release before december,
Early maintenance releases are definitely possible, but the consequences of particular regressions need to be put into terms that make sense to the release team, which generally means stepping up from "APIs X, Y, and Z broke in this way" to "Users doing A, B, and C will be affected in this way".

As an example of a case where an early maintenance release took place: several years ago, Python 2.6.3 happened to break both "from logging import *" (due to a missing entry in test___all__ letting an error in logging.__all__ through) and building extension modules with setuptools (due to a change in a private API that setuptools was monkeypatching). Those were considered significant enough for the 2.6.4 release to happen early.
what are the alternative solutions for AsyncIO users ? 1. Use 3.5.1 and hope that Linux distributions won't use 3.5.2 ?
Linux distributions have mechanisms to carry patches (indeed, selective application of patches is one of the main benefits of using system packages over upstream ones), so any distro that rebases on 3.5.2 can be encouraged to add the fix once it lands regardless of whether or not Larry approves an additional maintenance release outside the normal cadence. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
Hi Nick, First, thanks a lot for your detailed answer, it was very instructive to me. My answers below. 2016-07-03 6:09 GMT+02:00 Nick Coghlan <ncoghlan@gmail.com>:
On 2 July 2016 at 16:17, Ludovic Gasc <gmludo@gmail.com> wrote:
Hi everybody,
I fully understand that AsyncIO is a drop in the ocean of CPython, and that you're working to prepare the entire 3.5.3 release for December, which isn't ready yet. However, could you create a 3.5.2.1 release with only this AsyncIO fix?
That would be more work than just doing a 3.5.3 release, though - the problem isn't with the version number bump, it's with asking the release team to do additional work without clearly explaining the rationale for the request (more on that below). While some parts of the release process are automated, there's still a lot of steps to run through by a number of different people: https://www.python.org/dev/peps/pep-0101/.
Thanks for the link; I didn't know about this PEP, it was interesting to read.
The first key question to answer in this kind of situation is: "Is there code that will run correctly on 3.5.1 that will now fail on 3.5.2?" (i.e. it's a regression introduced by the asyncio and coroutine changes in the point release rather than something that was already broken in 3.5.0 and 3.5.1).
If the answer is "No", then it doesn't inhibit the 3.5.2 rollout in any way, and folks can wait until 3.5.3 for the fix.
However, if the answer is "Yes, it's a new regression in 3.5.2" (as in this case), then the next question becomes "Is there an agreed resolution for the regression?"
The answer to that is currently "No" - Yury's PR against the asyncio repo is still being discussed.
Once the answer to that question is "Yes", *then* the question of releasing a high priority fix in a Python 3.5.3 release can be properly considered by answering the question "Of the folks using asyncio, what proportion of them are likely to encounter problems in upgrading to Python 3.5.2, and is there a workaround they can apply or alternate approach they can use to avoid the problem?".
At the moment, Yury's explanation of the fix in the PR is (understandably) addressed at getting the problem resolved within the context of asyncio, and hence just describes the particular APIs affected, and the details of the incorrect behaviour. While that's an important step in the process, it doesn't provide a clear assessment of the *consequences* of the bug aimed at folks that aren't themselves deeply immersed in using asyncio, so we can't tell if the problem is "Some idiomatic code frequently recommended in user facing examples and used in third party asyncio based libraries may hang client processes" (which would weigh in favour of an early 3.5.3 release before people start encountering the regression in practice) or "Some low level API's not recommended for general use may hang if used in a particular non-idiomatic combination only likely to be encountered by event loop implementors" (which would suggest it may be OK to stick with the normal maintenance release cadence).
To my basic understanding, it seems there are race conditions when opening sockets. If my understanding is correct, it's a little bit the heart of AsyncIO that is affected ;-)

If you search for loop.sock_connect on GitHub, you find a lot of results: https://github.com/search?l=python&q=loop.sock_connect&ref=searchresults&type=Code&utf8=%E2%9C%93

Moreover, if Yury, one of the contributors of AsyncIO (https://github.com/python/asyncio/graphs/contributors) and the creator of uvloop, has sent an e-mail about this, I'm tempted to believe him. That's why I'm a little bit scared by this, even if we don't have a lot of AsyncIO users, especially on the latest release. However, Google Trends might give us a good overview of the relative number of users we have, compared to Twisted, Gevent and Tornado: https://www.google.com/trends/explore#q=asyncio%2C%20%2Fm%2F02xknvd%2C%20gevent%2C%20%2Fm%2F07s58h4&date=1%2F2016%2012m&cmpt=q&tz=Etc%2FGMT-2
If 3.5.2.1 or 3.5.3 are impossible to release before december,
Early maintenance releases are definitely possible, but the consequences of particular regressions need to be put into terms that make sense to the release team, which generally means stepping up from "APIs X, Y, and Z broke in this way" to "Users doing A, B, and C will be affected in this way".
As an example of a case where an early maintenance release took place: several years ago, Python 2.6.3 happened to break both "from logging import *" (due to a missing entry in test___all__ letting an error in logging.__all__ through) and building extension modules with setuptools (due to a change in a private API that setuptools was monkeypatching). Those were considered significant enough for the 2.6.4 release to happen early.
OK, we'll first see what decision emerges about this pull request in AsyncIO.
what are the alternative solutions for AsyncIO users ? 1. Use 3.5.1 and hope that Linux distributions won't use 3.5.2 ?
Linux distributions have mechanisms to carry patches (indeed, selective application of patches is one of the main benefits of using system packages over upstream ones), so any distro that rebases on 3.5.2 can be encouraged to add the fix once it lands regardless of whether or not Larry approves an additional maintenance release outside the normal cadence.
Good to know. It means that it's mostly Mac and Windows users who are concerned by this bug, especially newcomers, because they download directly from the python.org website. Depending on the pull request decision, there might also be a warning message on the downloads page explaining to use 3.5.1 instead of 3.5.2 if you want to use AsyncIO.

Cheers,
Nick.
-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
Another thought recently occurred to me. Do releases really have to be such big productions? A recent ACM article by Tom Limoncelli[1] reminded me that we're doing releases the old-fashioned way -- infrequently, and with lots of manual labor.

Maybe we could (eventually) try to strive for a lighter-weight, more automated release process? It would be less work, and it would reduce stress for authors of stdlib modules and packages -- there's always the next release.

I would think this wouldn't obviate the need for carefully planned and timed "big deal" feature releases, but it could make the bug fix releases *less* of a deal, for everyone.

[1] http://cacm.acm.org/magazines/2016/7/204027-the-small-batches-principle/abst... (sadly requires login)

--
--Guido van Rossum (python.org/~guido)
Many of our users prefer stability (the sort who plan operating system updates years in advance), but generally I'm in favor of more frequent releases. It will likely require more complex branching though, presumably based on the LTS model everyone else uses.

One thing we've discussed before is separating core and stdlib releases. I'd be really interested to see a release where most of the stdlib is just preinstalled (and upgradeable) PyPI packages. We can pin versions/bundle wheels for stable releases and provide a fast track via pip to update individual packages.

Probably no better opportunity to make such a fundamental change as we move to a new VCS...

Cheers,
Steve

Top-posted from my Windows Phone
[forking the conversation since the subject has shifted] On Sun, 3 Jul 2016 at 09:50 Steve Dower <steve.dower@python.org> wrote:
Many of our users prefer stability (the sort who plan operating system updates years in advance), but generally I'm in favour of more frequent releases.
So there's our 18 month cadence for feature/minor releases, and then there's the 6 month cadence for bug-fix/micro releases.

At the language summit there was the discussion kicked off by Ned about our release schedule, and a group of us had a discussion afterward where a stricter release cadence of 12 months was proposed, with the release date tied to a consistent month -- e.g. September of every year -- instead of our hand-wavy "about 18 months after the last feature release"; people in the discussion seemed to like the 12 months consistency idea. I think making releases on a regular, annual schedule simply requires a decision by us to do it, since the time scale we are talking about is still so large it shouldn't impact the workload of RMs & friends *that* much (I think).

As for upping the bug-fix release cadence, if we can automate that then perhaps we can up the frequency (maybe once every quarter), but I'm not sure what kind of overhead that would add and thus how much would need to be automated to make that release cadence work. Doing this kind of shrunken cadence for bug-fix releases would require the RM & friends to decide what would need to be automated to shrink the release schedule to make it viable (e.g. "if we automated steps N & M of the release process then I would be okay releasing every 3 months instead of 6").

For me, I say we shift to an annual feature release in a specific month every year, and switch to quarterly bug-fix releases only if we can add zero extra work to RMs & friends.
It will likely require more complex branching though, presumably based on the LTS model everyone else uses.
Why is that? You can almost view our feature releases as LTS releases, at which point our current branching structure is no different.
One thing we've discussed before is separating core and stdlib releases. I'd be really interested to see a release where most of the stdlib is just preinstalled (and upgradeable) PyPI packages. We can pin versions/bundle wheels for stable releases and provide a fast track via pip to update individual packages.
Probably no better opportunity to make such a fundamental change as we move to a new VCS...
<deep breath />

Topic 1
=======
If we separate out the stdlib, we first need to answer why we are doing this. The arguments supporting this idea are (1) it might simplify more frequent releases of Python (but that's a guess), (2) it would make the stdlib less CPython-dependent (if purely by the fact of perception and ease of testing using CI against other interpreters when they have matching version support), and (3) it might make it easier for us to get more contributors who are comfortable helping with just the stdlib vs CPython itself (once again, this might simply be through perception).

So if we really wanted to go this route of breaking out the stdlib, I think we have two options. One is to have the cpython repo represent the CPython interpreter and then have a separate stdlib repo. The other option is to still have cpython represent the interpreter but then have each stdlib module in its own repository.

Since the single repo for the stdlib is not that crazy, I'll talk about the crazier N repo idea (in all scenarios we would probably have a repo that pulled in cpython and the stdlib through either git submodules or subtrees and that would represent a CPython release repo). In this scenario, having each module/package have its own repo could get us a couple of things. One is that it might help simplify module maintenance by allowing each module to have its own issue tracker, set of contributors, etc. This also means it will make it obvious what modules are being neglected, which will either draw attention and get help or honestly lead to a deprecation if no one is willing to help maintain it.

Separate repos would also allow for easier backport releases (e.g. what asyncio and typing have been doing since they were created). If a module is maintained as if it was its own project then it makes it easier to make releases separated from the stdlib itself (although the usefulness is minimized as long as sys.path has site-packages as its last entry). Separate releases allow for faster releases of the stand-alone module, e.g. if only asyncio has a bug then asyncio can cut their own release and the rest of the stdlib doesn't need to care. Then when a new CPython release is done we can simply bundle up the stable releases at that moment and essentially make our mythical sumo release be the stdlib release itself (and this would help stop modules like asyncio and typing from simply copying modules into the stdlib from their external repo if we just pulled in their repo using submodules or subtrees in a master repo).

And yes, I realize this might lead to a ton of repos, but maybe that's an important side effect. We have so much code in our stdlib that it's hard to maintain and fixes can get dropped on the floor. If this causes us to re-prioritize what should be in the stdlib and trim it back to things we consider critical to have in all Python releases, then IMO that's a huge win in maintainability and workload savings instead of carrying forward neglected code (or at least it helps people focus on modules they care about and lets others know where help is truly needed).

Topic 2
=======
Independent releases of the stdlib could be done, although if we break the stdlib up into individual repos then it shifts the conversation as individual modules could simply do their own releases independent of the big stdlib release.
Personally I don't see a point of doing a stdlib release separate from CPython, but I could see doing a more frequent release of CPython where the only thing that changed is the stdlib itself (but I don't know if that would even alleviate the RM workload). For me, I'm more interested in thinking about breaking the stdlib modules into their own repos and making a CPython release more of a collection of python-dev-approved modules that are maintained under the python organization on GitHub and follow our compatibility guidelines and code quality along with the CPython interpreter. This would also make it much easier for custom distros, e.g. a cloud-targeted CPython release that ignored all GUI libraries. -Brett
Cheers, Steve
On 3 July 2016 at 21:22, Brett Cannon <brett@python.org> wrote:
Topic 2 ======= Independent releases of the stdlib could be done, although if we break the stdlib up into individual repos then it shifts the conversation as individual modules could simply do their own releases independent of the big stdlib release. Personally I don't see a point of doing a stdlib release separate from CPython, but I could see doing a more frequent release of CPython where the only thing that changed is the stdlib itself (but I don't know if that would even alleviate the RM workload).
The one major downside of independent stdlib releases is that it significantly increases the number of permutations of things 3rd parties have to support. It can be hard enough to get a user to report the version of Python they are having an issue with - to get them to report both python and stdlib version would be even trickier. And testing against all the combinations, and deciding which combinations are supported, becomes a much bigger problem.

Furthermore, pip/setuptools are just getting to the point of allowing for dependencies conditional on Python version. If independent stdlib releases were introduced, we'd need to implement dependencies based on stdlib version as well - consider depending on a backport of a new module if the user has an older stdlib version that doesn't include it.

Changing the principle that the CPython version is a well-defined label for a specific language level and stdlib is a major change with very wide implications, and I don't see sufficient benefits to justify it. On the other hand, simply decoupling the internal development cycles for the language and the stdlib (or independent stdlib modules), without adding extra "release" cycles, is not that big a deal - in many ways, we do that already with projects like asyncio.

Paul
On Sun, Jul 3, 2016, 13:43 Paul Moore <p.f.moore@gmail.com> wrote:
On 3 July 2016 at 21:22, Brett Cannon <brett@python.org> wrote:

Topic 2
=======
Independent releases of the stdlib could be done, although if we break the stdlib up into individual repos then it shifts the conversation as individual modules could simply do their own releases independent of the big stdlib release. Personally I don't see a point of doing a stdlib release separate from CPython, but I could see doing a more frequent release of CPython where the only thing that changed is the stdlib itself (but I don't know if that would even alleviate the RM workload).
The one major downside of independent stdlib releases is that it significantly increases the number of permutations of things 3rd parties have to support. It can be hard enough to get a user to report the version of Python they are having an issue with - to get them to report both python and stdlib version would be even trickier. And testing against all the combinations, and deciding which combinations are supported, becomes a much bigger problem.
Furthermore, pip/setuptools are just getting to the point of allowing for dependencies conditional on Python version. If independent stdlib releases were introduced, we'd need to implement dependencies based on stdlib version as well - consider depending on a backport of a new module if the user has an older stdlib version that doesn't include it.
Changing the principle that the CPython version is a well-defined label for a specific language level and stdlib, is a major change with very wide implications, and I don't see sufficient benefits to justify it. On the other hand, simply decoupling the internal development cycles for the language and the stdlib (or independent stdlib modules), without adding extra "release" cycles, is not that big a deal - in many ways, we do that already with projects like asyncio.
This last bit is what I would advocate if we broke the stdlib out unless an emergency patch release is warranted for a specific module (e.g. like asyncio that started this discussion). Obviously backporting is its own thing. -Brett
Paul
On 3 July 2016 at 22:04, Brett Cannon <brett@python.org> wrote:
This last bit is what I would advocate if we broke the stdlib out unless an emergency patch release is warranted for a specific module (e.g. like asyncio that started this discussion). Obviously backporting is its own thing.
It's also worth noting that pip has no mechanism for installing an updated stdlib module, as everything goes into site-packages, and the stdlib takes precedence over site-packages unless you get into sys.path hacking abominations like setuptools uses (or at least used to use, I don't know if it still does). So as things stand, independent patch releases of stdlib modules would need to be manually copied into place.

Allowing users to override the stdlib opens up a different can of worms - not necessarily one that we couldn't resolve, but IIRC, it was always a deliberate policy that overriding the stdlib wasn't possible (that's why backports have names like unittest2...)

Paul
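P.S. A quick way to see the ordering I mean, on a standard CPython install (a minimal sketch - the exact entries vary by platform and distro):

    import sys
    import email   # any stdlib module will do

    print(email.__file__)    # resolves to the stdlib copy, not site-packages
    for entry in sys.path:
        print(entry)         # stdlib directories are listed before site-packages

which is why a same-named package that pip drops into site-packages gets shadowed by the stdlib copy.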
On Sun, Jul 3, 2016, 14:22 Paul Moore <p.f.moore@gmail.com> wrote:
On 3 July 2016 at 22:04, Brett Cannon <brett@python.org> wrote:
This last bit is what I would advocate if we broke the stdlib out unless an emergency patch release is warranted for a specific module (e.g. like asyncio that started this discussion). Obviously backporting is its own thing.
It's also worth noting that pip has no mechanism for installing an updated stdlib module, as everything goes into site-packages, and the stdlib takes precedence over site-packages unless you get into sys.path hacking abominations like setuptools uses (or at least used to use, I don't know if it still does). So as things stand, independent patch releases of stdlib modules would need to be manually copied into place.
I thought I mentioned this depends on changing sys.path; sorry if I didn't.
Allowing users to override the stdlib opens up a different can of worms - not necessarily one that we couldn't resolve, but IIRC, it was always a deliberate policy that overriding the stdlib wasn't possible (that's why backports have names like unittest2...)
I think it could be considered less of an issue now, thanks to being able to declare dependencies and their version requirements for pip. -brett
Paul
My thinking on this issue was that some/most packages from the stdlib would move into site-packages. Certainly I'd expect asyncio to be in this category, and probably typing. Even going as far as email and urllib would potentially be beneficial (to those packages, is my thinking).

Obviously not every single module can do this, but there are plenty that aren't low-level dependencies for other modules that could. Depending on particular versions of these then becomes a case of adding normal package version constraints - we could even bundle version information for non-updateable packages so that installs fail on incompatible Python versions.

The "Uber repository" could be a requirements.txt that pulls down wheels for the selected stable versions of each package so that we still distribute all the same code with the same stability, but users have much more ability to patch their own stdlib after install.

(FWIW, we use a system similar to this at Microsoft for building Visual Studio, so I can vouch that it works on much more complicated software than Python.)

Cheers,
Steve

Top-posted from my Windows Phone
On 07/04/2016 12:19 AM, Steve Dower wrote:
My thinking on this issue was that some/most packages from the stdlib would move into site-packages. Certainly I'd expect asyncio to be in this category, and probably typing. Even going as far as email and urllib would potentially be beneficial (to those packages, is my thinking).
Obviously not every single module can do this, but there are plenty that aren't low-level dependencies for other modules that could. Depending on particular versions of these then becomes a case of adding normal package version constraints - we could even bundle version information for non-updateable packages so that installs fail on incompatible Python versions.
The "Uber repository" could be a requirements.txt that pulls down wheels for the selected stable versions of each package so that we still distribute all the same code with the same stability, but users have much more ability to patch their own stdlib after install.
(FWIW, we use a system similar to this at Microsoft for building Visual Studio, so I can vouch that it works on much more complicated software than Python.)
While we're on the subject, I'd like to offer another point for consideration: not all implementations of Python can provide the full stdlib, and not everyone wants the full stdlib.

For MicroPython, most of Python's batteries are too heavy. Tkinter on Android is probably not useful enough for people to port it. Weakref can't be emulated nicely in Javascript. If packages had a way to opt out of needing the whole standard library, and instead specify the stdlib subset they need, answering questions like "will this run on my phone?" and "what piece of the stdlib do we want to port next?" would be easier.

Both Debian and Fedora package some parts of the stdlib separately (tkinter, venv, tests), and have opt-in subsets of the stdlib for minimal systems (python-minimal, system-python). Tools like pyinstaller run magic heuristics to determine what parts of stdlib can be left out. It would help these projects if the "not all of stdlib is installed" case was handled more systematically at the CPython or distutils level.

As I said at the Language Summit, this is just talk; I don't currently have the resources to drive this effort. But if anyone is thinking of splitting the stdlib, please keep these points in mind as well.

I think that, at least, if "pip install -U asyncio" becomes possible, "pip uninstall --yes-i-know-what-im-doing asyncio" should be possible as well.
On Tue, Jul 5, 2016 at 5:53 PM, Petr Viktorin <encukou@gmail.com> wrote:
If packages had a way to opt-out of needing the whole standard library, and instead specify the stdlib subset they need, answering questions like "will this run on my phone?" and "what piece of the stdlib do we want to port next?" would be easier.
On the flip side, answering questions like "what version of Python do people need to run my program?" becomes harder, particularly if you have third-party dependencies. (The latest version of numpy might decide that it's going to 'import statistics', for instance.)

One of the arguments against splitting the stdlib was that corporate approval for software is often hard to obtain, and it's much easier to say "I need approval to use Python, exactly as distributed by python.org" than "I need approval to use Python-core plus these five Python-stdlib sections".

ChrisA
On 07/05/2016 10:05 AM, Chris Angelico wrote:
On Tue, Jul 5, 2016 at 5:53 PM, Petr Viktorin <encukou@gmail.com> wrote:
If packages had a way to opt-out of needing the whole standard library, and instead specify the stdlib subset they need, answering questions like "will this run on my phone?" and "what piece of the stdlib do we want to port next?" would be easier.
On the flip side, answering questions like "what version of Python do people need to run my program" become harder to answer, particularly if you have third-party dependencies. (The latest version of numpy might decide that it's going to 'import statistics', for instance.)
That question is already hard to answer. How do you tell if a library works on MicroPython? Or Python for Android? I'm not arguing to change the default; if the next version of numpy doesn't do anything differently, nothing should change. However, under the status quo, "Python 3.4" means "CPython 3.4 with the full stdlib, otherwise all bets are off", and there's no good way to opt in to more granularity.
One of the arguments against splitting the stdlib was that corporate approval for software is often hard to obtain, and it's much easier to say "I need approval to use Python, exactly as distributed by python.org" than "I need approval to use Python-core plus these five Python-stdlib sections".
I'm not arguing that "Python, exactly as distributed by python.org" should stop including all of the stdlib. I would like to make stripped-down variants of CPython easier to build, and to make it possible to opt in to using CPython without all of the stdlib, so that major problems with stdlib availability in other Python implementations can be caught early.

Basically, instead of projects getting commits like "Add metadata for one flavor of Android packaging tool", I'd like to see "Add common metadata for Android, iPhone, PyInstaller, and minimal Linux, and make sure the CPython-based CI smoke-tests that metadata".

Also, I believe corporate approval for python.org's Python is a bit of a red herring – usually you'd get approval for Python distributed by Continuum or Red Hat or Canonical or some such. As a Red Hat employee, I can say that what I'm suggesting won't make me suffer, and I see no reason it would hurt the others either.
Hi,

asyncio is a good example because it wants to evolve faster than the whole "CPython package". Each minor version of CPython adds new features to asyncio. It is not always easy to document these changes. Moreover, the behaviour of some functions also changed in minor versions. asyncio doesn't follow the trend of semantic versioning (http://semver.org/): the major version should change if the behaviour of an existing function changes.

2016-07-05 10:05 GMT+02:00 Chris Angelico <rosuav@gmail.com>:
On the flip side, answering questions like "what version of Python do people need to run my program" become harder to answer, particularly if you have third-party dependencies. (The latest version of numpy might decide that it's going to 'import statistics', for instance.)
Recently, I wrote a "perf" module and I wanted to use the statistics module. I was surprised to see that PyPI has a package called "statistics" which just works on Python 2.7. In practice, I can use statistics on Python 2.7, 3.4 and even 3.2 (but I didn't try; that version is too old). It's a matter of describing dependencies correctly.

pip supports a requirements.txt file, which is a nice way to declare dependencies. You can:

* specify the minimum library version
* make some libraries specific to some operating systems
* skip dependencies on some Python versions -- very helpful for libraries that became part of the Python 3 stdlib (like statistics)

=> see environment markers for conditions on dependencies

For perf, I'm using this setup() option in setup.py: 'install_requires': ["statistics; python_version < '3.4'", "six"]
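Expanded into a minimal setup.py it looks roughly like this (the metadata values are placeholders, only the install_requires entry matters):

    from setuptools import setup

    setup(
        name='perf',
        version='0.1',            # placeholder
        py_modules=['perf'],      # placeholder
        install_requires=[
            # environment marker: only pull the PyPI backport on old Pythons;
            # statistics is part of the stdlib since 3.4
            "statistics; python_version < '3.4'",
            "six",
        ],
    )

pip evaluates the marker against the target interpreter at install time, so users on 3.4+ never download the backport.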
One of the arguments against splitting the stdlib was that corporate approval for software is often hard to obtain, and it's much easier to say "I need approval to use Python, exactly as distributed by python.org" than "I need approval to use Python-core plus these five Python-stdlib sections".
*If* someone wants to split the stdlib into smaller parts and/or move it out of CPython, they should already start writing a PEP. Otherwise they will have to reply to the same questions over and over ;-) Is there a volunteer to write such a PEP?

Victor
On Jul 05, 2016, at 11:19 AM, Victor Stinner wrote:
pip supports a requirements.txt file which is a nice may to declare dependency. You can:
* specify the minimum library version * make some library specific to some operation systems * skip dependencies on some Python versions -- very helpful for libraries parts of Python 3 stdlib (like statistics)
Interestingly enough, I'm working on a project where we *have* to use packages from the Ubuntu archive, even if there are different (or differently fixed) versions on PyPI. I don't think there's a way to map a requirements.txt into distro package versions and do the install from the distro package manager, but that might be useful. Cheers, -Barry
On Tue, Jul 05, 2016 at 09:53:24AM +0200, Petr Viktorin wrote:
While we're on the subject, I'd like to offer another point for consideration: not all implementations of Python can provide the full stdlib, and not everyone wants the full stdlib.
For MicroPython, most of Python's batteries are too heavy. Tkinter on Android is probably not useful enough for people to port it. Weakref can't be emulated nicely in Javascript. If packages had a way to opt-out of needing the whole standard library, and instead specify the stdlib subset they need, answering questions like "will this run on my phone?" and "what piece of the stdlib do we want to port next?" would be easier.
I don't know that they will be easier. That seems pretty counter-intuitive to me. At the moment, answering these questions is really easy if you use nothing but the std lib: the answer is, if you can install Python, it will work. As soon as you start using non-stdlib modules, the question becomes:

- have you installed Python? have you installed module X? and module Y? and module Z? do they match the version of the interpreter? where did you get them from? are you missing dependencies?

I can't tell you how much trouble I've had trying to get tkinter working on some Fedora systems because they split tkinter into a separate package. Sure, if I had *known* that it was split into a separate package, then just running `yum install packagename` would (probably!) have worked, but how was I supposed to know? It's not documented anywhere that I could find. I ended up avoiding the Fedora packages and installing from source.

I think there comes a time in every successful organisation when it risks losing sight of what made it successful in the first place. (And, yes, I'm aware that the *other* way that successful organisations lose their way is by failing to change with the times.) Yes, we're all probably sick and tired of hearing all the Chicken Little scare stories about how the GIL is killing Python, how everyone is abandoning Python for Ruby/Javascript/Go/Swift, how Python 3 is killing Python, etc. But sometimes the sky does fall. For many people, Python's single biggest advantage until now has been "batteries included", and I think that changing that is risky and shouldn't be done lightly.

It's easy to say "just use pip", but if you've ever been stuck behind a corporate firewall where pip doesn't work, or where downloading and installing software is a firing offence, then you might think differently. If you've had to teach a room full of 14 year olds, and you spend the entire lesson just helping them to install one library, you might have a different view.

The other extreme is Javascript/Node.js, where the "just use pip" (or npm in this case) philosophy has been taken to such extremes that one developer practically brought down the entire Node.js ecosystem by withdrawing an eleven line module, left-pad, in a fit of pique. Being open source, the damage was routed around quite quickly, but still, I think it's a good cautionary example of how a technological advance can transform a programming culture for the worse.

--
Steve
On 5 July 2016 at 18:02, Steven D'Aprano <steve@pearwood.info> wrote:
Yes, we're all probably sick and tired of hearing all the Chicken Little scare stories about how the GIL is killing Python, how everyone is abandoning Python for Ruby/Javascript/Go/Swift, how Python 3 is killing Python, etc. But sometimes the sky does fall. For many people, Python's single biggest advantage until now has been "batteries included", and I think that changing that is risky and shouldn't be done lightly.
+1 To be fair, I don't think anyone is looking at this "lightly", but I do think it's easy to underestimate the value of "batteries included", and the people it's *most* useful for are precisely the people who aren't involved in any of the Python mailing lists. They just want to get on with things, and "it came with the language" is a *huge* selling point. Internal changes in how we manage the stdlib modules are fine. But changing what the end user sees as "python" is a much bigger deal. Paul
On 05Jul2016 1021, Paul Moore wrote:
On 5 July 2016 at 18:02, Steven D'Aprano <steve@pearwood.info> wrote:
Yes, we're all probably sick and tired of hearing all the Chicken Little scare stories about how the GIL is killing Python, how everyone is abandoning Python for Ruby/Javascript/Go/Swift, how Python 3 is killing Python, etc. But sometimes the sky does fall. For many people, Python's single biggest advantage until now has been "batteries included", and I think that changing that is risky and shouldn't be done lightly.
+1
To be fair, I don't think anyone is looking at this "lightly", but I do think it's easy to underestimate the value of "batteries included", and the people it's *most* useful for are precisely the people who aren't involved in any of the Python mailing lists. They just want to get on with things, and "it came with the language" is a *huge* selling point.
Internal changes in how we manage the stdlib modules are fine. But changing what the end user sees as "python" is a much bigger deal.
Also +1 on this - a default install of Python should continue to include everything it currently does. My interest in changing anything at all is to provide options for end-users/distributors to reduce the footprint (which they already do), to more quickly update specific modules, and perhaps, long-term, to make users' code less tied to a particular Python version (instead being tied to, for example, a specific asyncio version that can be brought into a range of supported Python versions).

Batteries included is a big deal.

Cheers,
Steve
(sorry if you get this twice, I dropped python-dev by mistake) On 07/05/2016 07:02 PM, Steven D'Aprano wrote:
On Tue, Jul 05, 2016 at 09:53:24AM +0200, Petr Viktorin wrote:
While we're on the subject, I'd like to offer another point for consideration: not all implementations of Python can provide the full stdlib, and not everyone wants the full stdlib.
For MicroPython, most of Python's batteries are too heavy. Tkinter on Android is probably not useful enough for people to port it. Weakref can't be emulated nicely in Javascript. If packages had a way to opt-out of needing the whole standard library, and instead specify the stdlib subset they need, answering questions like "will this run on my phone?" and "what piece of the stdlib do we want to port next?" would be easier.
I don't know that they will be easier. That seems pretty counter- intuitive to me. At the moment, answering these questions are really easy if you use nothing but the std lib: the answer is, if you can install Python, it will work. As soon as you start using non-stdlib modules, the question becomes:
- have you installed Python? have you installed module X? and module Y? and module Z? do they match the version of the interpreter? where did you get them from? are you missing dependencies?
I can't tell you how much trouble I've had trying to get tkinter working on some Fedora systems because they split tkinter into a separate package. Sure, if I had *known* that it was split into a separate package, then just running `yum install packagename` would (probably!) have worked, but how was I supposed to know? It's not documented anywhere that I could find. I ended up avoiding the Fedora packages and installing from source.
Ah, but successfully installing from source doesn't always give you the full stdlib either:

    Python build finished successfully!
    The necessary bits to build these optional modules were not found:
    _sqlite3
    To find the necessary bits, look in setup.py in detect_modules() for
    the module's name.

I have missed that message before, and wondered pretty hard why the module wasn't there.

In the tkinter case, compiling from source is easy on a developer's computer, but doing that on a headless server brings in devel files for the entire graphical environment. Are you saying Python on servers should have a way to do turtle graphics, otherwise it's not Python?
I think there comes a time in every successful organisation that they risk losing sight of what made them successful in the first place. (And, yes, I'm aware that the *other* way that successful organisations lose their way is by failing to change with the times.)
Yes, we're all probably sick and tired of hearing all the Chicken Little scare stories about how the GIL is killing Python, how everyone is abandoning Python for Ruby/Javascript/Go/Swift, how Python 3 is killing Python, etc. But sometimes the sky does fall. For many people, Python's single biggest advantage until now has been "batteries included", and I think that changing that is risky and shouldn't be done lightly.
It's easy to say "just use pip", but if you've ever been stuck behind a corporate firewall where pip doesn't work, or where dowloading and installing software is a firing offence, then you might think differently. If you've had to teach a room full of 14 year olds, and you spend the entire lesson just helping them to install one library, you might have a different view.
That is why I'm *not* arguing for shipping an incomplete stdlib in official Python releases. I fully agree that including batteries is great – I'm just not a fan of welding the battery to the product.

There are people who want to cut out what they don't need – to build self-contained applications (pyinstaller, Python for Android), or to eliminate unnecessary dependencies (python3-tkinter). And they will do it with CPython's blessing or without. I don't think Python can move to the mobile world of self-contained apps without this problem being solved, one way or another. It would be much better for the ecosystem if CPython acknowledges this and sets some rules (like "here's how you can do it, but don't call the result an unqualified Python").
The other extreme is Javascript/Node.js, where the "just use pip" (or npm in this case) philosophy has been taken to such extremes that one developer practically brought down the entire Node.js ecosystem by withdrawing an eleven line module, left-pad, in a fit of pique.
Being open source, the damage was routed around quite quickly, but still, I think it's a good cautionary example of how a technological advance can transform a programming culture to the worse.
I don't understand the analogy. Should the eleven-line module have been in Node's stdlib? Outside of stdlib, people are doing this.
On 5 July 2016 at 19:01, Petr Viktorin <encukou@gmail.com> wrote:
There are people who want to cut out what they don't need – to build self-contained applications (pyinstaller, Python for Android), or to eliminate unnecessary dependencies (python3-tkinter). And they will do it with CPython's blessing or without. [...] It would be much better for the ecosystem if CPython acknowledges this and sets some rules (like "here's how you can do it, but don't call the result an unqualified Python").
That doesn't sound unreasonable in principle. As a baseline, I guess the current policy is essentially:

"""
You can build your own Python installation with whatever parts of the stdlib omitted that you please. However, if you do this, you accept responsibility for any consequences, in terms of 3rd-party modules not working, or even stdlib breakage (for example, we don't guarantee that parts of the stdlib may not rely on other parts).
"""

That's pretty simple, both to state and to adhere to. And it's basically the current reality. What I'm not clear about is what *additional* guarantees people want to make, and how we'd make them. First of all, Python's packaging ecosystem has no way to express "this package won't work if pathlib isn't present in your stdlib". It seems to me that without something like that, it's pretty hard to support anything better than the current position with regard to 3rd party modules. Documenting stdlib inter-dependencies may be possible, but I wouldn't like to make "X doesn't depend on Y" a guarantee that's subject to backward compatibility rules.

Maybe the suggestion is to provide better tools for people wanting to *build* such stripped down versions? That might be a possibility, I guess. I don't know much about how people build their own copies of Python to be able to comment.

So I guess my question is, what is the actual proposal here? People seem to have concerns over things that aren't actually being proposed - but without knowing what *is* being proposed, it's hard to avoid that.

Paul
On Tue, 5 Jul 2016 at 13:02 Paul Moore <p.f.moore@gmail.com> wrote:
On 5 July 2016 at 19:01, Petr Viktorin <encukou@gmail.com> wrote:
There are people who want to cut out what they don't need – to build self-contained applications (pyinstaller, Python for Android), or to eliminate unnecessary dependencies (python3-tkinter). And they will do it with CPython's blessing or without. [...] It would be much better for the ecosystem if CPython acknowledges this and sets some rules (like "here's how you can do it, but don't call the result an unqualified Python").
That doesn't sound unreasonable in principle. As a baseline, I guess the current policy is essentially:
""" You can build your own Python installation with whatever parts of the stdlib omitted that you please. However, if you do this, you accept responsibility for any consequences, in terms of 3rd-party modules not working, or even stdlib breakage (for example, we don't guarantee that parts of the stdlib may not rely on other parts). """
That's pretty simple, both to state and to adhere to. And it's basically the current reality. What I'm not clear about is what *additional* guarantees people want to make, and how we'd make them. First of all, Python's packaging ecosystem has no way to express "this package won't work if pathlib isn't present in your stdlib". It seems to me that without something like that, it's pretty hard to support anything better than the current position with regard to 3rd party modules. Documenting stdlib inter-dependencies may be possible, but I wouldn't like to make "X doesn't depend on Y" a guarantee that's subject to backward compatibility rules.
Maybe the suggestion is to provide better tools for people wanting to *build* such stripped down versions? That might be a possibility, I guess. I don't know much about how people build their own copies of Python to be able to comment.
So I guess my question is, what is the actual proposal here? People seem to have concerns over things that aren't actually being proposed - but without knowing what *is* being proposed, it's hard to avoid that.
Realizing that all of these are just proposals with no solid plan behind them, that they are all predicated on moving to GitHub, and that none of these are directly promoting releasing every module in the stdlib on PyPI as a stand-alone package with its own versioning, they are:

1. Break the stdlib out from CPython and have it be a stand-alone repo
2. Break the stdlib up into a bunch of independent repos that when viewed together make up the stdlib (Steve Dower did some back-of-envelope grouping and pegged the # of repos at ~50)
On 05Jul2016 1404, Brett Cannon wrote:
Realizing that all of these are just proposals with no solid plan behind them, they are all predicated on moving to GitHub, and none of these are directly promoting releasing every module in the stdlib on PyPI as a stand-alone package with its own versioning, they are:
1. Break the stdlib out from CPython and have it be a stand-alone repo 2. Break the stdlib up into a bunch of independent repos that when viewed together make up the stdlib (Steve Dower did some back-of-envelope grouping and pegged the # of repos at ~50)
Actually, I was meaning to directly promote releasing each module on PyPI as a standalone package :)

"Each module" here is at most the ~50 I counted (though likely many, many fewer), which sounds intimidating until you realise that there are virtually no cross-dependencies between them and most would only depend on the core stdlib (which would *not* be a package on PyPI - you get most of Lib/*.py with the core install and it's fixed, while much of Lib/**/* is separately updateable).

Take email as an example. Its external dependencies (found by grep for "import") are abc, base64, calendar, datetime, functools, imghdr, os, quopri, sndhdr, socket, time, urllib.parse, uu, warnings. IMHO, only urllib has the slightest chance of being non-fixed here (remembering that "non-fixed" means upgradeable from PyPI, not that it's missing).

The circular references (email<->urllib) would probably need to be resolved, but I think many would see that as a good cleanliness step anyway. A quick glance suggests that both email and urllib are only using each other's public APIs, which means that any version of the other package is sufficient. An official release picks the two latest designated stable releases (this is where I'm imagining a requirements.txt-like file in the core repository) and bundles them both, and then users can update either package on its own. If email makes a change that requires a particular change to urllib, it adds a version constraint, which will force both users and the core requirements.txt file to update both together (this is probably why the circular references would need breaking...)

Done with care and even incrementally (e.g. just the provisional modules at first), I don't think this is that scary a proposition. It's not strictly predicated on moving to github or to many repositories, but if we did decide to make a drastic change to the repository layout (which I think this requires at a minimum, at least for our own sanity), doing it while migrating VCS at least keeps all the disruption together.

Cheers, Steve
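A rough sketch of how such a "grep for import" dependency survey could be reproduced with today's tools (illustrative only; it records just top-level absolute imports, so the result will differ slightly from the hand-collected list above):

    # List the top-level absolute imports used inside the stdlib 'email'
    # package, roughly approximating the dependency survey described above.
    import ast
    import email
    import pathlib

    pkg_dir = pathlib.Path(email.__file__).parent
    deps = set()
    for path in pkg_dir.rglob("*.py"):
        tree = ast.parse(path.read_text(encoding="utf-8"))
        for node in ast.walk(tree):
            if isinstance(node, ast.Import):
                deps.update(alias.name.split(".")[0] for alias in node.names)
            elif isinstance(node, ast.ImportFrom) and node.level == 0 and node.module:
                deps.add(node.module.split(".")[0])

    print(sorted(deps - {"email"}))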
On 6 July 2016 at 07:04, Brett Cannon <brett@python.org> wrote:
Realizing that all of these are just proposals with no solid plan behind them, they are all predicated on moving to GitHub, and none of these are directly promoting releasing every module in the stdlib on PyPI as a stand-alone package with its own versioning, they are:
1. Break the stdlib out from CPython and have it be a stand-alone repo 2. Break the stdlib up into a bunch of independent repos that when viewed together make up the stdlib (Steve Dower did some back-of-envelope grouping and pegged the # of repos at ~50)
3. Keep everything in the main CPython repo, but add a "Bundled" subdirectory of independently releasable multi-version compatible subprojects that we move some Lib/* components to. I think one of our goals here should be that "./configure && make && make altinstall" continues to get you a full Python installation for the relevant version. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
On Tue, 5 Jul 2016 at 18:16 Nick Coghlan <ncoghlan@gmail.com> wrote:
On 6 July 2016 at 07:04, Brett Cannon <brett@python.org> wrote:
Realizing that all of these are just proposals with no solid plan behind them, they are all predicated on moving to GitHub, and none of these are directly promoting releasing every module in the stdlib on PyPI as a stand-alone package with its own versioning, they are:
1. Break the stdlib out from CPython and have it be a stand-alone repo 2. Break the stdlib up into a bunch of independent repos that when viewed together make up the stdlib (Steve Dower did some back-of-envelope grouping and pegged the # of repos at ~50)
3. Keep everything in the main CPython repo, but add a "Bundled" subdirectory of independently releasable multi-version compatible subprojects that we move some Lib/* components to.
That's basically what Steve is proposing.
I think one of our goals here should be that "./configure && make && make altinstall" continues to get you a full Python installation for the relevant version.
I don't think anyone is suggesting otherwise. You just might have to do `git clone --recursive` to get a full-fledged CPython checkout w/ stdlib.
On Tue, Jul 05, 2016 at 08:01:43PM +0200, Petr Viktorin wrote:
In the tkinter case, compiling from source is easy on a developer's computer, but doing that on a headless server brings in devel files for the entire graphical environment. Are you saying Python on servers should have a way to do turtle graphics, otherwise it's not Python?
That's a really good question. I don't think we have an exact answer to "What counts as Python?". It's not like ECMAScript (Javascript) or C where there's a standard that defines the language and standard modules. We just have some de facto guidelines:

- CPython is definitely Python;
- Jython is surely Python, even if it lacks the byte-code of CPython and some things behave slightly differently;
- MicroPython is probably Python, because nobody expects to be able to run Tkinter GUI apps on an embedded device with 256K of RAM;

but it's hard to make that judgement except on a case-by-case basis.

I think though that even if there's no documented line, most people recognise that there are "core" and "non-core" standard modules. dis and tkinter are non-core: if µPython leaves out tkinter, nobody will be surprised; if Jython leaves out dis, nobody will hold it against them; but if they leave out math or htmllib that's another story. So a headless server can probably leave out tkinter; but a desktop shouldn't.

[...]
The other extreme is Javascript/Node.js, where the "just use pip" (or npm in this case) philosophy has been taken to such extremes that one developer practically brought down the entire Node.js ecosystem by withdrawing an eleven-line module, left-pad, in a fit of pique.
Being open source, the damage was routed around quite quickly, but still, I think it's a good cautionary example of how a technological advance can transform a programming culture for the worse.
I don't understand the analogy. Should the eleven-line module have been in Node's stdlib? Outside of stdlib, people are doing this.
The point is that Javascript/Node.js is so lacking in batteries that the community culture has gravitated to an extreme version of "just use pip". I'm not suggesting that you, or anyone else, has proposed that Python do the same, only that there's a balance to be found between the extremes of "everything in the Python ecosystem should be part of the standard installation" and "next to nothing should be part of the standard installation". The hard part is deciding where that balance should be :-) -- Steve
On 07/06/2016 05:11 AM, Steven D'Aprano wrote:
On Tue, Jul 05, 2016 at 08:01:43PM +0200, Petr Viktorin wrote:
In the tkinter case, compiling from source is easy on a developer's computer, but doing that on a headless server brings in devel files for the entire graphical environment. Are you saying Python on servers should have a way to do turtle graphics, otherwise it's not Python?
That's a really good question.
I don't think we have an exact answer to "What counts as Python?". It's not like ECMAScript (Javascript) or C where there's a standard that defines the language and standard modules. We just have some de facto guidelines:
- CPython is definitely Python; - Jython is surely Python, even if it lacks the byte-code of CPython and some things behave slightly differently; - MicroPython is probably Python, because nobody expects to be able to run Tkinter GUI apps on an embedded device with 256K of RAM;
but it's hard to make that judgement except on a case-by-case basis.
I think though that even if there's no documented line, most people recognise that there are "core" and "non-core" standard modules. dis and tkinter are non-core: if µPython leaves out tkinter, nobody will be surprised; if Jython leaves out dis, nobody will hold it against them; but if they leave out math or htmllib that's another story.
For MicroPython, I would definitely expect htmllib to be an optional add-on – it's not useful for reading data off a thermometer and saving it to an SD card. But I guess that's getting too deep into specifics.
So a headless server can probably leave out tkinter; but a desktop shouldn't.
Up till recently this wasn't possible to express in terms of RPM dependencies. Now, it's on the ever-growing TODO list...

Another problem here is that you don't explicitly "install Python" on Fedora: when you install the system, you get a minimal set of packages to make everything work, and most of Python is part of that – but tkinter is not. This is in contrast to python.org releases, where you explicitly ask for (all of) Python. Technically it would now be possible to require Python to be installed explicitly before it can be used, but we run into another "batteries included" problem: Python (or, "most-of-Python") is a pretty good battery for an OS.

Maybe a good short-term solution would be to make "import tkinter" raise ImportError("Run `dnf install tkinter` to install the tkinter module") if not found. This would prevent confusion while keeping the status quo. I'll look into that.
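A minimal sketch of one possible mechanism (illustrative only, not how Fedora actually packages things): the base install could ship a placeholder module that the real tkinter package overwrites.

    # Hypothetical placeholder tkinter/__init__.py, installed only when the
    # real tkinter package is absent; importing it raises the informative
    # error suggested above.
    raise ImportError(
        "Run `dnf install tkinter` to install the tkinter module")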
[...]
The other extreme is Javascript/Node.js, where the "just use pip" (or npm in this case) philosophy has been taken to such extremes that one developer practically brought down the entire Node.js ecosystem by withdrawing an eleven-line module, left-pad, in a fit of pique.
Being open source, the damage was routed around quite quickly, but still, I think it's a good cautionary example of how a technological advance can transform a programming culture for the worse.
I don't understand the analogy. Should the eleven-line module have been in Node's stdlib? Outside of stdlib, people are doing this.
The point is that Javascript/Node.js is so lacking in batteries that the community culture has gravitated to an extreme version of "just use pip". I'm not suggesting that you, or anyone else, has proposed that Python do the same, only that there's a balance to be found between the extremes of "everything in the Python ecosystem should be part of the standard installation" and "next to nothing should be part of the standard installation".
The hard part is deciding where that balance should be :-)
I think the balance is where it needs to be for CPython, and it's also where it needs to be for Fedora. The real hard part is acknowledging that it needs to be in different places for different use cases, and making sure work to support the different use cases is coordinated.

So, I guess I'm starting to form a concrete proposal:

1) Document what should happen when a stdlib module is not available. This should be an informative ImportError message, something along the lines of 'This build of Python does not include SQLite support.' or 'MicroPython does not support turtle' or 'Use `sudo your-package-manager install tkinter` to install this module'.

2) Document leaf modules (or "clusters") that can be removed from the stdlib, and their dependencies. Make no guarantees about cross-version compatibility of this metadata.

3) Standardize a way to query which stdlib modules are present (without side effects, i.e. trying an import doesn't count); a rough sketch of what this could build on follows below.

4) Adjust pip to ignore installed stdlib modules that are present, so distributions can depend on "statistics" and not "statistics if python_ver<3.4". (statistics is just an example, obviously this would only work for modules added after the PEP). For missing stdlib modules, pip should fail with the message from 1). (Unless the "pip upgrade asyncio" proposal goes through, in which case install the module if it's upgradable.)

5) Reserve all stdlib module names on PyPI for backports or non-installable placeholders.

6) To ease smoke-testing behavior on Pythons without all of the stdlib, allow pip to remove leaf stdlib modules from a venv.

Looks like it's time for a PEP.
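For item 3, a rough sketch of what such a presence check could build on today (importlib.util.find_spec locates a module without executing it, though it isn't entirely free of import-system side effects):

    import importlib.util

    def stdlib_module_present(name):
        # True if the import system can locate `name` without actually
        # importing (executing) the module.
        return importlib.util.find_spec(name) is not None

    # Note: this only checks that the Python module can be found; a missing
    # C extension (e.g. _tkinter behind tkinter) would still fail at import time.
    print(stdlib_module_present("tkinter"))
    print(stdlib_module_present("statistics"))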
On Wed, Jul 6, 2016 at 7:01 PM, Petr Viktorin <encukou@gmail.com> wrote:
Maybe a good short-term solution would be to make "import tkinter" raise ImportError("Run `dnf install tkinter` to install the tkinter module") if not found. This would prevent confusion while keeping the status quo. I'll look into that.
+1. There's precedent for it; Debian does this:

rosuav@sikorsky:~$ python
Python 2.7.11+ (default, Jun 2 2016, 19:34:15)
[GCC 5.3.1 20160528] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import Tkinter
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python2.7/lib-tk/Tkinter.py", line 42, in <module>
    raise ImportError, str(msg) + ', please install the python-tk package'
ImportError: No module named _tkinter, please install the python-tk package
ChrisA
On 6 July 2016 at 10:01, Petr Viktorin <encukou@gmail.com> wrote:
4) Adjust pip to ignore installed stdlib modules that are present, so distributions can depend on "statistics" and not "statistics if python_ver<3.4". (statistics is just an example, obviously this would only work for modules added after the PEP). For missing stdlib modules, pip should fail with the message from 1). (Unless the "pip upgrade asyncio" proposal goes through, in which case install the module if it's upgradable).
A couple of comments here.

1. Projects may still need to depend on "statistics from Python 3.6 or later, but the one in 3.5 isn't good enough". Consider for example unittest, where projects often need the backport unittest2 to get access to features that aren't in older versions.

2. This is easy enough to do if we make stdlib modules publish version metadata. But it does raise the question of what the version of a stdlib module is - probably Python version plus a micro version for interim updates. Also, I have a recollection of pip having problems with some stdlib modules that publish version data right now (wsgiref?) - that should be checked to make sure this approach would work.
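As a small illustration of comment 1, here is roughly what the backport case looks like today, assuming a setuptools/pip recent enough to understand PEP 508 environment markers (the project name is made up); note that the condition can only be keyed off the Python version, not off a stdlib module's own version:

    # Illustrative setup.py fragment only, not a proposal.
    from setuptools import setup

    setup(
        name="example-project",   # hypothetical
        version="1.0",
        install_requires=[
            # Today: pull in the unittest2 backport on older Pythons.
            'unittest2; python_version < "3.5"',
            # What a stdlib-version-aware dependency might look like one day
            # (hypothetical, not currently expressible):
            # "statistics >= 3.6.1",
        ],
    )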
Looks like it's time for a PEP.
Probably - in principle, something like this proposal could be workable, it'll be a matter of thrashing out the details (which is something the PEP process is good at). Paul
I think the wsgiref issue was that it wasn't in site-packages and so couldn't be removed or upgraded. Having the dist-info available and putting them in site-packages (or a new directory?) lets us handle querying/replacing/removing using the existing tools we use for distribution.

Also, I think the version numbers do need to be independent of Python version in case nothing changes between releases. If you develop something using statistics==3.7, but statistics==3.6 is identical, how do you know you can put the lower constraint? Alternatively, if it's statistics==1.0 in both, people won't assume it has anything to do with the Python version.

The tricky part here is when everything is in the one repo and everyone implicitly uses the latest version for development, you get the reproducibility issues Barry mentioned earlier. If they're separate and most people have pinned versions, that goes away.

Top-posted from my Windows Phone
I consider making stdlib modules "optional" like this to be completely separate from making them individually versioned - can't quite tell whether you guys do as well?

To everyone: please don't conflate these two discussions. The other is about CPython workflow and this one is about community/user expectations (I have not been proposing to remove stdlib modules at any point).

Cheers, Steve

Top-posted from my Windows Phone
On 7 July 2016 at 00:27, Steve Dower <steve.dower@python.org> wrote:
I consider making stdlib modules "optional" like this to be completely separate from making them individually versioned - can't quite tell whether you guys do as well?
The point of overlap I see is that if the stdlib starts putting some selected modules into site-packages (so "pip install --upgrade <sublibrary>" works without any changes to pip or equivalent tools), then that also solves the "How to explicitly declare dependencies on particular pieces of the standard library" problem: you use the same mechanisms we already use to declare dependencies on 3rd party packages distributed via a packaging server. I really like the idea of those independently versioned subcomponents of the standard library only being special in the way they're developed (i.e. as part of the standard library) and the way they're published (i.e. included by default with the python.org binary installers and source tarballs), rather than also being special in the way they're managed at runtime. Versioning selected stdlib subsets also has potential to help delineate clear expectations for redistributors: Are you leaving a particular subset out of your default install? Then you should propose that it become an independently versioned subset of the stdlib. Is a given subset already independently versioned in the stdlib? Then it may be OK to leave it out of your default install and document that you've done so.
To everyone: please don't conflate these two discussions. The other is about CPython workflow and this one is about community/user expectations (I have not been proposing to remove stdlib modules at any point).
While I agree they're separate discussions, the workflow management one has the potential to *also* improve the user experience in cases where redistributors are already separating out pieces of the stdlib to make them optional. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
On 7/6/2016 10:44 PM, Nick Coghlan wrote:
The point of overlap I see is that if the stdlib starts putting some selected modules into site-packages (so "pip install --upgrade <sublibrary>" works without any changes to pip or equivalent tools), then that also solves the "How to explicitly declare dependencies on particular pieces of the standard library" problem: you use the same mechanisms we already use to declare dependencies on 3rd party packages distributed via a packaging server.
One thing to keep in mind if we do this is how it interacts with the -S command line option to not include site-packages in sys.path. I currently use -S to basically mean "give my python as it was distributed, and don't include anything that was subsequently added by adding other RPM's (or package manager of your choice)". I realize that's a rough description, and possibly an abuse of -S. If using -S were to start excluding parts of the stdlib, that would be a problem for me. Eric.
On Jul 07, 2016, at 08:12 AM, Eric V. Smith wrote:
One thing to keep in mind if we do this is how it interacts with the -S command line option to not include site-packages in sys.path. I currently use -S to basically mean "give my python as it was distributed, and don't include anything that was subsequently added by adding other RPM's (or package manager of your choice)". I realize that's a rough description, and possibly an abuse of -S. If using -S were to start excluding parts of the stdlib, that would be a problem for me.
It's an important consideration, and leads to another discussion that's recurred over the years. Operating systems often want an "isolated" Python, similar to what's given by -I, which cannot be altered by subsequent installs. It's one of the things that led to the Debian ecosystem using dist-packages for PyPI installed packages. Without isolation, it's just too easy for some random PyPI thing to break your system, and yes, that has really happened in the past. So if we go down the path of moving more of the stdlib to site-packages, we also need to get serious about a system-specific isolated Python. Cheers, -Barry
Yes, not too long ago I installed "every" Python package on Ubuntu, and Python basically would not start. Perhaps some plugin system was trying to import everything and caused a segfault in GTK. The "short sys.path" model, where everything installed is importable, has its limits. On Thu, Jul 7, 2016 at 9:24 AM Barry Warsaw <barry@python.org> wrote:
On Jul 07, 2016, at 08:12 AM, Eric V. Smith wrote:
One thing to keep in mind if we do this is how it interacts with the -S command line option to not include site-packages in sys.path. I currently use -S to basically mean "give my python as it was distributed, and don't include anything that was subsequently added by adding other RPM's (or package manager of your choice)". I realize that's a rough description, and possibly an abuse of -S. If using -S were to start excluding parts of the stdlib, that would be a problem for me.
It's an important consideration, and leads to another discussion that's recurred over the years. Operating systems often want an "isolated" Python, similar to what's given by -I, which cannot be altered by subsequent installs. It's one of the things that lead to the Debian ecosystem using dist-packages for PyPI installed packages. Without isolation, it's just too easy for some random PyPI thing to break your system, and yes, that has really happened in the past.
So if we go down the path of moving more of the stdlib to site-packages, we also need to get serious about a system-specific isolated Python.
Cheers, -Barry
On 07Jul2016 0624, Barry Warsaw wrote:
On Jul 07, 2016, at 08:12 AM, Eric V. Smith wrote:
One thing to keep in mind if we do this is how it interacts with the -S command line option to not include site-packages in sys.path. I currently use -S to basically mean "give my python as it was distributed, and don't include anything that was subsequently added by adding other RPM's (or package manager of your choice)". I realize that's a rough description, and possibly an abuse of -S. If using -S were to start excluding parts of the stdlib, that would be a problem for me.
It's an important consideration, and leads to another discussion that's recurred over the years. Operating systems often want an "isolated" Python, similar to what's given by -I, which cannot be altered by subsequent installs. It's one of the things that lead to the Debian ecosystem using dist-packages for PyPI installed packages. Without isolation, it's just too easy for some random PyPI thing to break your system, and yes, that has really happened in the past.
So if we go down the path of moving more of the stdlib to site-packages, we also need to get serious about a system-specific isolated Python.
I've done just enough research to basically decide that putting any of the stdlib in site-packages is infeasible (it'll break virtualenv/venv as well), so don't worry about that. A "dist-packages" equivalent is a possibility, and it may even be possible to manage these packages directly in Lib/, which would result in basically no visible impact for anyone who doesn't care to update individual parts. Cheers, Steve
On Jul 3, 2016 1:45 PM, "Paul Moore" <p.f.moore@gmail.com> wrote:
[...]
Furthermore, pip/setuptools are just getting to the point of allowing for dependencies conditional on Python version. If independent stdlib releases were introduced, we'd need to implement dependencies based on stdlib version as well - consider depending on a backport of a new module if the user has an older stdlib version that doesn't include it.
Regarding this particular point: right now, yeah, there's an annoying thing where you have to know that a dependency on stdlib/backported library X has to be written as "X >= 1.0 [py_version <= 3.4]" or whatever, and every package with this dependency has to encode some complicated indirect knowledge of what versions of X ship with what versions of python. (And life is even more complicated if you want to support pypy/jython/..., who are generally shipping manually maintained stdlib forks, and whose nominal "python version equivalent" is only an approximation.) In the extreme, one can imagine a module like typing still being distributed as part of the standard python download, BUT not in the stdlib, but rather as a "preinstalled package" in site-packages/ that could then be upgraded normally after install. In addition to whatever maintenance advantages this might (or might not) have, with regards to Paul's concerns this would actually be a huge improvement, since if a package needs typing 1.3 or whatever then they could just declare that, without having to know a priori which versions of python shipped which version. (Note that linux distributions already split up the stdlib into pieces, and you're not guaranteed to have all of it available.) Or if we want to be less aggressive and keep the stdlib monolithic, then it would still be great if there were some .dist-info metadata somewhere that said "this version of the stdlib provides typing 1.3, asyncio 1.4, ...". I haven't thought through all the details of how this would work and how pip could best take advantage, though. -n
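A purely hypothetical sketch of that last idea: a machine-readable record of which bundled-package versions a given stdlib provides. The file name and format below are invented for illustration and do not exist anywhere today.

    import json
    import pathlib
    import sysconfig

    # Hypothetical manifest the stdlib could ship, e.g.:
    # {"typing": "1.3", "asyncio": "1.4"}
    manifest = pathlib.Path(sysconfig.get_path("stdlib")) / "bundled-versions.json"
    bundled = json.loads(manifest.read_text()) if manifest.exists() else {}

    print(bundled.get("typing", "no per-module version recorded"))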
As an observer and user—

It may be worth asking the Rust team what the main pain points are in coordinating and managing their releases.

Some context for those unfamiliar: Rust uses a Chrome- or Firefox-like release train approach, with stable and beta releases every six weeks. Each release cycle includes both the compiler and the standard library. They use feature flags on "nightly" (the master branch) and cut release branches for what actually gets shipped in each release. This has the advantage of letting new features and functionality ship whenever they're ready, rather than waiting for Big Bang releases. Because of strong commitments to stability and backwards compatibility as part of that, it hasn't led to any substantial breakage along the way, either.

There is also some early discussion of how they might add LTS releases into that mix.

The Rust standard library is currently bundled into the same repository as the compiler. Although the stdlib is currently being modularized and somewhat decoupled from the compiler, I don't believe they intend to separate it from the compiler repository or release in that process (not least because there's no need to further speed up their release cadence!).

None of that is meant to suggest Python adopt that specific cadence (though I have found it quite nice), but simply to observe that the Rust team might have useful info on upsides, downsides, and particular gotchas as Python considers changing its own release process.

Regards, Chris Krycho
On Jul 3, 2016, at 16:22, Brett Cannon <brett@python.org> wrote:
[forking the conversation since the subject has shifted]
On Sun, 3 Jul 2016 at 09:50 Steve Dower <steve.dower@python.org> wrote: Many of our users prefer stability (the sort who plan operating system updates years in advance), but generally I'm in favour of more frequent releases.
So there's our 18 month cadence for feature/minor releases, and then there's the 6 month cadence for bug-fix/micro releases. At the language summit there was the discussion kicked off by Ned about our release schedule, and a group of us had a discussion afterward about a stricter release cadence of 12 months, with the release date tied to a consistent month -- e.g. September of every year -- instead of our hand-wavy "about 18 months after the last feature release"; people in the discussion seemed to like the 12-month consistency idea. I think making releases on a regular, annual schedule requires simply a decision by us to do it since the time scale we are talking about is still so large it shouldn't impact the workload of RMs & friends that much (I think).
As for upping the bug-fix release cadence, if we can automate that then perhaps we can up the frequency (maybe once every quarter), but I'm not sure what kind of overhead that would add and thus how much would need to be automated to make that release cadence work. Doing this kind of shrunken cadence for bug-fix releases would require the RM & friends to decide what would need to be automated to shrink the release schedule to make it viable (e.g. "if we automated steps N & M of the release process then I would be okay releasing every 3 months instead of 6").
For me, I say we shift to an annual feature release in a specific month every year, and switch to quarterly bug-fix releases only if we can add zero extra work to RMs & friends.
It will likely require more complex branching though, presumably based on the LTS model everyone else uses.
Why is that? You can almost view our feature releases as LTS releases, at which point our current branching structure is no different.
One thing we've discussed before is separating core and stdlib releases. I'd be really interested to see a release where most of the stdlib is just preinstalled (and upgradeable) PyPI packages. We can pin versions/bundle wheels for stable releases and provide a fast track via pip to update individual packages.
Probably no better opportunity to make such a fundamental change as we move to a new VCS...
<deep breath />
Topic 1
=======

If we separate out the stdlib, we first need to answer why we are doing this. The arguments supporting this idea are (1) it might simplify more frequent releases of Python (but that's a guess), (2) it would make the stdlib less CPython-dependent (if purely by the fact of perception and ease of testing using CI against other interpreters when they have matching version support), and (3) it might make it easier for us to get more contributors who are comfortable helping with just the stdlib vs CPython itself (once again, this might simply be through perception).
So if we really wanted to go this route of breaking out the stdlib, I think we have two options. One is to have the cpython repo represent the CPython interpreter and then have a separate stdlib repo. The other option is to still have cpython represent the interpreter but then have each stdlib module in its own repository.
Since the single repo for the stdlib is not that crazy, I'll talk about the crazier N repo idea (in all scenarios we would probably have a repo that pulled in cpython and the stdlib through either git submodules or subtrees and that would represent a CPython release repo). In this scenario, having each module/package have its own repo could get us a couple of things. One is that it might help simplify module maintenance by allowing each module to have its own issue tracker, set of contributors, etc. This also means it will make it obvious what modules are being neglected which will either draw attention and get help or honestly lead to a deprecation if no one is willing to help maintain it.
Separate repos would also allow for easier backport releases (e.g. what asyncio and typing have been doing since they were created). If a module is maintained as if it was its own project then it makes it easier to make releases separated from the stdlib itself (although the usefulness is minimized as long as sys.path has site-packages as its last entry). Separate releases allows for faster releases of the stand-alone module, e.g. if only asyncio has a bug then asyncio can cut their own release and the rest of the stdlib doesn't need to care. Then when a new CPython release is done we can simply bundle up the stable release at the moment and essentially make our mythical sumo release be the stdlib release itself (and this would help stop modules like asyncio and typing from simply copying modules into the stdlib from their external repo if we just pulled in their repo using submodules or subtrees in a master repo).
And yes, I realize this might lead to a ton of repos, but maybe that's an important side effect. We have so much code in our stdlib that it's hard to maintain and fixes can get dropped on the floor. If this causes us to re-prioritize what should be in the stdlib and trim it back to things we consider critical to have in all Python releases, then IMO that's a huge win in maintainability and workload savings instead of carrying forward neglected code (or at least help people focus on modules they care about and let others know where help is truly needed).
Topic 2
=======

Independent releases of the stdlib could be done, although if we break the stdlib up into individual repos then it shifts the conversation as individual modules could simply do their own releases independent of the big stdlib release. Personally I don't see a point of doing a stdlib release separate from CPython, but I could see doing a more frequent release of CPython where the only thing that changed is the stdlib itself (but I don't know if that would even alleviate the RM workload).
For me, I'm more interested in thinking about breaking the stdlib modules into their own repos and making a CPython release more of a collection of python-dev-approved modules that are maintained under the python organization on GitHub and follow our compatibility guidelines and code quality along with the CPython interpreter. This would also make it much easier for custom distros, e.g. a cloud-targeted CPython release that ignored all GUI libraries.
-Brett
Cheers, Steve
Top-posted from my Windows Phone From: Guido van Rossum Sent: 7/3/2016 7:42 To: Python-Dev Cc: Nick Coghlan Subject: Re: [Python-Dev] Request for CPython 3.5.3 release
Another thought recently occurred to me. Do releases really have to be such big productions? A recent ACM article by Tom Limoncelli[1] reminded me that we're doing releases the old-fashioned way -- infrequently, and with lots of manual labor. Maybe we could (eventually) try to strive for a lighter-weight, more automated release process? It would be less work, and it would reduce stress for authors of stdlib modules and packages -- there's always the next release. I would think this wouldn't obviate the need for carefully planned and timed "big deal" feature releases, but it could make the bug fix releases *less* of a deal, for everyone.
[1] http://cacm.acm.org/magazines/2016/7/204027-the-small-batches-principle/abst... (sadly requires login)
-- --Guido van Rossum (python.org/~guido)
I actually thought about Rust when thinking about 3 month releases (I know they release faster though). What I would want to know is whether the RMs for Rust are employed by Mozilla and thus have work time to do it, vs Python RMs & friends who vary on whether they get work time. On Sun, Jul 3, 2016, 13:54 Chris Krycho <chris@chriskrycho.com> wrote:
As an observer and user—
It may be worth asking the Rust team what the main pain points are in coordinating and managing their releases.
Some context for those unfamiliar: Rust uses a Chrome- or Firefox-like release train approach, with stable and beta releases every six weeks. Each release cycle includes both the compiler and the standard library. They use feature flags on "nightly" (the master branch) and cut release branches for what actually gets shipped in each release. This has the advantage of letting new features and functionality ship whenever they're ready, rather than waiting for Big Bang releases. Because of strong commitments to stability and backwards compatibility as part of that, it hasn't led to any substantial breakage along the way, either.
There is also some early discussion of how they might add LTS releases into that mix.
The Rust standard library is currently bundled into the same repository as the compiler. Although the stdlib is currently being modularized and somewhat decoupled from the compiler, I don't believe they intend to separate it from the compiler repository or release in that process (not least because there's no need to further speed up their release cadence!).
None of that is meant to suggest Python adopt that specific cadence (though I have found it *quite* nice), but simply to observe that the Rust team might have useful info on upsides, downsides, and particular gotchas as Python considers changing its own release process.
Regards, Chris Krycho
On 7/3/2016 4:22 PM, Brett Cannon wrote:
So if we really wanted to go this route of breaking out the stdlib, I think we have two options. One is to have the cpython repo represent the CPython interpreter and then have a separate stdlib repo. The other option is to still have cpython represent the interpreter but then each stdlib module have their own repository.
Option 3 is something in between: groups of stdlib modules in their own repository. An obvious example: a gui group with _tkinter, tkinter, idlelib, turtle, turtledemo, and their doc files. Having 100s of repositories would not work well with TortoiseHg. -- Terry Jan Reedy
On 03Jul2016 1556, Terry Reedy wrote:
On 7/3/2016 4:22 PM, Brett Cannon wrote:
So if we really wanted to go this route of breaking out the stdlib, I think we have two options. One is to have the cpython repo represent the CPython interpreter and then have a separate stdlib repo. The other option is to still have cpython represent the interpreter but then each stdlib module have their own repository.
Option 3 is something in between: groups of stdlib modules in their own repository. An obvious example: a gui group with _tkinter, tkinter, idlelib, turtle, turtledemo, and their doc files. Having 100s of repositories would not work well with TortoiseHg.
A rough count of how I'd break up the current 3.5 Lib folder (which I happened to have handy) suggests no more than 50 repos. But there'd be no need to have all of them checked out just to build - only the ones you want to modify. And in that case, you'd probably have a stable Python to work against the separate package repo and wouldn't need to clone the core one. (I'm envisioning a build process that generates wheels from online sources and caches them. So updating the stdlib wheel cache would be part of the build process, but then the local wheels are used to install.)

I personally would only have about 5 repos cloned on any of my dev machines (core, ctypes, distutils, possibly tkinter, ssl), as I rarely touch any other packages. (Having those separate from core is mostly for the versioning benefits - I doubt we could ever release Python without them, but it'd be great to be able to update distutils, ctypes or ssl in place with a simple pip/package mgr command.)

Cheers, Steve
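[As a rough sketch of the wheel-cache idea Steve describes - the cache directory name and the choice of the asyncio PyPI package as an example are illustrative assumptions, not part of any actual CPython build process - pip can already build wheels into a local directory and later install from it without touching the network:

    # build a wheel for the asyncio PyPI package into a local cache directory
    python -m pip wheel --wheel-dir=stdlib-wheels asyncio

    # later, install using only the cached wheels, with no network access
    python -m pip install --no-index --find-links=stdlib-wheels asyncio

The build process Steve envisions would presumably populate such a cache as part of building CPython, and the installer would then install from the local wheels.]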
On Jul 03, 2016, at 04:21 PM, Steve Dower wrote:
A rough count of how I'd break up the current 3.5 Lib folder (which I happened to have handy) suggests no more than 50 repos.
A concern with a highly split stdlib is local testing. I'm not worried about pull request testing, or after-the-fact buildbot testing, since I'd have to assume that we'd make sure the fully integrated sumo package was tested in both environments.

But what about local testing? Let's say you change something in one module that causes a regression in a different module in a different repo. If you've only got a small subset checked out, you might never notice that before you PR'd your change. And then once the test fails, how easy will it be for you to recreate the tested environment locally so that you could debug your regression?

I'm sure it's doable, but let's not lose sight of that if this path is taken. (Personally, I'm +0 on splitting out the stdlib and -1 on micro-splitting it.)

Cheers, -Barry
On 05Jul2016 1028, Barry Warsaw wrote:
On Jul 03, 2016, at 04:21 PM, Steve Dower wrote:
A rough count of how I'd break up the current 3.5 Lib folder (which I happened to have handy) suggests no more than 50 repos.
A concern with a highly split stdlib is local testing. I'm not worried about pull request testing, or after-the-fact buildbot testing since I'd have to assume that we'd make sure the fully integrated sumo package was tested in both environments.
But what about local testing? Let's say you change something in one module that causes a regression in a different module in a different repo. If you've only got a small subset checked out, you might never notice that before you PR'd your change. And then once the test fails, how easy will it be for you to recreate the tested environment locally so that you could debug your regression?
I'm sure it's doable, but let's not lose sight of that if this path is taken.
My hope is that it would be essentially a "pip freeze"/"pip install -r ..." (or equivalent with whatever tool is used/created for managing the stdlib). Perhaps using VCS URIs rather than version numbers? That is, the test run would dump a list of exactly which stdlib versions it's using, so that when you review the results it is possible to recreate it. But the point is well taken. I'm very hesitant about splitting out packages that are common dependencies of other parts of the stdlib, but there are plenty of leaf nodes in there too. Creating a complex dependency graph would be a disaster. Cheers, Steve
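[To make the "pip freeze"-style idea concrete, a test run might dump a pinned requirements file of the stdlib packages it used, which a developer could then replay to recreate the tested environment. This is only a sketch of what Steve speculates about; the commit hashes below are purely illustrative, and the typing entry is just another example of a stdlib module that also lives in its own GitHub repo:

    # stdlib-freeze.txt - hypothetical output of one test run, pinning VCS commits
    git+https://github.com/python/asyncio@1a2b3c4#egg=asyncio
    git+https://github.com/python/typing@5d6e7f8#egg=typing

    # recreate the environment that was tested
    python -m pip install -r stdlib-freeze.txt

Pinning VCS commits rather than version numbers would let a reviewer reproduce exactly what the CI run exercised, at the cost of needing the repos to stay reachable.]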
(Personally, I'm +0 on splitting out the stdlib and -1 on micro-splitting it.)
Cheers, -Barry
On Jul 05, 2016, at 10:38 AM, Steve Dower wrote:
My hope is that it would be essentially a "pip freeze"/"pip install -r ..." (or equivalent with whatever tool is used/created for managing the stdlib). Perhaps using VCS URIs rather than version numbers?
That is, the test run would dump a list of exactly which stdlib versions it's using, so that when you review the results it is possible to recreate it.
I think you'd have to have vcs checkouts though, because you will often need to fix or change something in one of those other library pieces. The other complication of course is that now you'll have two dependent PRs with reviews in two different repos.
But the point is well taken. I'm very hesitant about splitting out packages that are common dependencies of other parts of the stdlib, but there are plenty of leaf nodes in there too. Creating a complex dependency graph would be a disaster.
Yeah. <shudder> Cheers, -Barry
On 6 July 2016 at 03:47, Barry Warsaw <barry@python.org> wrote:
On Jul 05, 2016, at 10:38 AM, Steve Dower wrote:
My hope is that it would be essentially a "pip freeze"/"pip install -r ..." (or equivalent with whatever tool is used/created for managing the stdlib). Perhaps using VCS URIs rather than version numbers?
That is, the test run would dump a list of exactly which stdlib versions it's using, so that when you review the results it is possible to recreate it.
I think you'd have to have vcs checkouts though, because you will often need to fix or change something in one of those other library pieces. The other complication of course is that now you'll have two dependent PRs with reviews in two different repos.
I'd actually advocate for keeping a unified clone, and making any use of pip to manage pieces of the standard library purely an install-time thing (as it is for ensurepip). Making stdlib development dependent on having a venv already set up, so that pip can do its thing without affecting the rest of your system, would pose a major bootstrapping problem.

It does mean we'd be introducing a greater divergence between the way devs work locally and the way the CI system works (as in this model we'd definitely want the buildbots to be exercising both the "test in checkout" and "test an installed version" cases), but working across multiple repos would be worse.

Cheers, Nick.

-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
On 4 July 2016 at 06:22, Brett Cannon <brett@python.org> wrote:
[forking the conversation since the subject has shifted]
On Sun, 3 Jul 2016 at 09:50 Steve Dower <steve.dower@python.org> wrote:
Many of our users prefer stability (the sort who plan operating system updates years in advance), but generally I'm in favour of more frequent releases.
So there's our 18 month cadence for feature/minor releases, and then there's the 6 month cadence for bug-fix/micro releases. At the language summit there was the discussion kicked off by Ned about our release schedule, and a group of us had a discussion afterward where a stricter release cadence of 12 months, with the release date tied to a consistent month -- e.g. September of every year -- was proposed instead of our hand-wavy "about 18 months after the last feature release"; people in the discussion seemed to like the 12 month consistency idea.
While we liked the "consistent calendar cadence that is some multiple of 6 months" idea, several of us thought 12 months was way too short, as it makes for too many entries in third party support matrices.

I'd also encourage folks to go back and read the two PEPs that were written the last time we had a serious discussion about changing the release cadence, since many of the concerns raised then remain relevant today:

* PEP 407 (faster cycle with LTS releases): https://www.python.org/dev/peps/pep-0407/
* PEP 413 (separate stdlib versioning): https://www.python.org/dev/peps/pep-0413/

In particular, the "unsustainable community support matrix" problem I describe in PEP 413 is still a major point of concern for me - we know from PyPI's download metrics that Python 2.6 is still in heavy use, so many folks have only started to bump their "oldest supported version" up to Python 2.7 in the last year or so (5+ years after it was released). People have been a bit more aggressive in dropping compatibility with older Python 3 versions, but it's also been the case that availability and adoption of LTS versions of Python 3 has been limited to date (mainly just the 5 years for 3.2 in Ubuntu 12.04 and 3.4 in Ubuntu 14.04 - the longest support lifecycle I'm aware of after that is Red Hat's 3 years for Red Hat Software Collections).

The reason I withdrew PEP 413 as a prospective answer to that problem is that I think there's generally only a limited number of libraries that are affected by the challenge of sometimes getting too old to be useful to cross-platform library and framework developers (mostly network protocol and file format related, but also the ones still marked as provisional), and the introduction of ensurepip gives us a new way of dealing with them: treating *those particular libraries* as independently upgradable bundled libraries, where the CPython build process creates wheel files for them, and then uses ensurepip's internally bundled pip to install those wheels at install time, even if pip itself is left out of the installation.

In the specific case that prompted this thread, for example, I don't think the problem is really that the standard library release cadence is too slow in general: it's that "pip install --upgrade asyncio" *isn't an available option* in Python 3.5, even if you're using a virtual environment. For other standard library modules, we've tackled that by letting people do things like "pip install contextlib2" to get the newer versions, even on older Python releases - individual projects are then responsible for opting in to using either the stdlib version or the potentially-newer backported version.

However, aside from the special case of ensurepip, what we've yet to do is ever make a standard library package *itself* independently upgradable (such that the Python version merely implies a *minimum* version of that library, rather than an exact version). Since it has core developers actively involved in its development, and already provides a PyPI package for the sake of Python 3.3 users, perhaps "asyncio" could make a good guinea pig for designing such a bundling process?

Cheers, Nick.

-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
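[For readers less familiar with the opt-in backport pattern Nick mentions, here is a minimal sketch. contextlib2 is the real PyPI backport referenced above; the fallback logic is only an illustration of how a project might opt in, not a recipe endorsed in this thread:

    # Prefer the potentially-newer PyPI backport when it's installed,
    # otherwise fall back to the version bundled in the standard library.
    try:
        from contextlib2 import ExitStack
    except ImportError:
        from contextlib import ExitStack

    # Callers use ExitStack the same way regardless of where it came from.
    with ExitStack() as stack:
        stack.callback(print, "cleanup runs when the block exits")

The project, not CPython, decides which version it gets; Nick's point is that for a bundled-but-upgradable asyncio, the Python version would only imply a minimum version of the library.]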
On Jul 04, 2016, at 10:31 AM, Nick Coghlan wrote:
While we liked the "consistent calendar cadence that is some multiple of 6 months" idea, several of us thought 12 months was way too short as it makes for too many entries in third party support matrices.
18 months for a major release cadence still seems right to me. Downstreams and third-parties often have to go through *a lot* of work to ensure compatibility, and try as we might, every Python release breaks *something*. Major version releases trigger a huge cascade of other work for lots of other people, and I don't think shortening that would be for the overall community good. It just feels like we'd always be playing catch up.

Different downstreams have different cadences. I can only speak for Debian, which has a release-when-ready policy, and Ubuntu, which has strictly timed releases. When the Python release aligns nicely with Ubuntu's LTS releases, we can usually manage the transition fairly well because we can allocate resources way ahead of time. (I'm less concerned about Ubuntu's mid-LTS 6 month releases.)

For example, 3.6 final will come out in December 2016, so it'll be past our current 16.10 Ubuntu release. We've pretty much decided to carry Python 3.5 through until 17.04, and that'll give us a good year to make 18.04 LTS have a solid Python 3.6 ecosystem. Projecting ahead, it probably means 3.7 in mid-2018, which is after the Ubuntu 18.04 LTS release, so we'll only do one major transition before the next LTS. From my perspective, that feels about right.

Cheers, -Barry
On 07/05/2016 10:44 AM, Barry Warsaw wrote:
On Jul 04, 2016, at 10:31 AM, Nick Coghlan wrote:
While we liked the "consistent calendar cadence that is some multiple of 6 months" idea, several of us thought 12 months was way too short as it makes for too many entries in third party support matrices.
18 months for a major release cadence still seems right to me. Downstreams and third-parties often have to go through *a lot* of work to ensure compatibility, and try as we might, every Python release breaks *something*. Major version releases trigger a huge cascade of other work for lots of other people, and I don't think shortening that would be for the overall community good. It just feels like we'd always be playing catch up.
+1 from me as well. Rapid major releases are just a huge headache. The nice thing about a .6 or .7 minor release is that we get closer to no bugs with each one. -- ~Ethan~
On Tue, 5 Jul 2016 at 10:45 Barry Warsaw <barry@python.org> wrote:
On Jul 04, 2016, at 10:31 AM, Nick Coghlan wrote:
While we liked the "consistent calendar cadence that is some multiple of 6 months" idea, several of us thought 12 months was way too short as it makes for too many entries in third party support matrices.
18 months for a major release cadence still seems right to me. Downstreams and third-parties often have to go through *a lot* of work to ensure compatibility, and try as we might, every Python release breaks *something*. Major version releases trigger a huge cascade of other work for lots of other people, and I don't think shortening that would be for the overall community good. It just feels like we'd always be playing catch up.
Sticking w/ 18 months is also fine, but then I would like to discuss choosing what months we try to release in, so we get into a date-based release cadence and know that releases typically happen in, e.g., December and June, thanks to our 6 month bug-fix release cadence. This has the nice benefit of all of us being generally aware of when a bug-fix release is coming up instead of having to check the PEP or go through our mail archive to find out what month a bug-fix is going to get cut (and it's also something the community can basically count on).
On 6 July 2016 at 05:11, Brett Cannon <brett@python.org> wrote:
Sticking w/ 18 months is also fine, but then I would like to discuss choosing what months we try to release in, so we get into a date-based release cadence and know that releases typically happen in, e.g., December and June, thanks to our 6 month bug-fix release cadence. This has the nice benefit of all of us being generally aware of when a bug-fix release is coming up instead of having to check the PEP or go through our mail archive to find out what month a bug-fix is going to get cut (and it's also something the community can basically count on).
I don't have a strong preference on that front, as even the worst case outcome of a schedule misalignment for Fedora is what's happening for Fedora 25 & 26: F25 in November will still have Python 3.5, while Rawhide will get the 3.6 beta in September or so, and then F26 will be released with 3.6 in the first half of next year. So really, I think the main criterion here is "Whatever works best for the folks directly involved in release management".

However, if we did decide we wanted to take minimising "time to redistribution" for at least Ubuntu & Fedora into account, then the two main points to consider would be:

- starting the upstream beta phase before the first downstream alpha freeze
- publishing the upstream final release before the last downstream beta freeze

Assuming 6 month distro cadences, and taking the F25 and 16.10 release cycles as representative, we get:

- Ubuntu alpha 1 releases in late January & June
- Fedora alpha freezes in early February & August
- Ubuntu final beta freezes in late March & September
- Fedora beta freezes in late March & September

Further assuming we stuck with the current model of ~3 months from beta 1 -> final release, that would suggest a cadence alternating between:

* December beta, February release
* May beta, August release

If we did that, then 3.6 -> 3.7 would be another "short" cycle (15 months from Dec 2016 to Feb 2018) before settling into a regular cadence of:

* 2017-12: 3.7.0b1
* 2018-02: 3.7.0
* 2019-05: 3.8.0b1
* 2019-08: 3.8.0
* 2020-12: 3.9.0b1
* 2021-02: 3.9.0
* 2022-05: 3.10.0b1
* 2022-08: 3.10.0
* etc...

The precise timing of maintenance releases isn't as big a deal (since they don't require anywhere near as much downstream coordination), but offsetting them by a month from the feature releases (so March & September in the Fedbuntu-driven proposal above) would allow for the X.(Y+1).1 release to go out at the same time as the final X.Y.Z release.

I'll reiterate though that we should be able to adjust to *any* consistent 18 month cycle downstream - the only difference will be the typical latency between new versions being released on python.org and them showing up in Linux distros as the system Python installation.

Cheers, Nick.

-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
On Jul 06, 2016, at 10:55 AM, Nick Coghlan wrote:
However, if we did decide we wanted to take minimising "time to redistribution" for at least Ubuntu & Fedora into account, then the two main points to consider would be:
- starting the upstream beta phase before the first downstream alpha freeze - publishing the upstream final release before the last downstream beta freeze
There have been cases in the past where the schedules didn't align perfectly, and we really wanted to get ahead of the game, so we released with a late beta, and then got SRUs (stable release update approvals) to move to the final release *after* the Ubuntu final release. This isn't great though, especially for non-LTS releases, because they have shorter lifecycles and no point releases. Cheers, -Barry
On 6 July 2016 at 11:09, Barry Warsaw <barry@python.org> wrote:
On Jul 06, 2016, at 10:55 AM, Nick Coghlan wrote:
However, if we did decide we wanted to take minimising "time to redistribution" for at least Ubuntu & Fedora into account, then the two main points to consider would be:
- starting the upstream beta phase before the first downstream alpha freeze - publishing the upstream final release before the last downstream beta freeze
There have been cases in the past where the schedules didn't align perfectly, and we really wanted to get ahead of the game, so we released with a late beta, and then got SRUs (stable release update approvals) to move to the final release *after* the Ubuntu final release. This isn't great though, especially for non-LTS releases, because they have shorter lifecycles and no point releases.
Aye, Petr and I actually discussed doing something like that in order to get Python 3.6 into F25, but eventually decided it would be better to just wait the extra 6 months. We may end up creating a Python 3.6 COPR for F24 & 25 though, similar to the one Matej Stuchlik created for F23 when Python 3.5 didn't quite make the release deadlines. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
On 05.07.2016 21:11, Brett Cannon wrote:
On Tue, 5 Jul 2016 at 10:45 Barry Warsaw <barry@python.org> wrote:
On Jul 04, 2016, at 10:31 AM, Nick Coghlan wrote:
While we liked the "consistent calendar cadence that is some multiple of 6 months" idea, several of us thought 12 months was way too short as it makes for too many entries in third party support matrices.
18 months for a major release cadence still seems right to me. Downstreams and third-parties often have to go through *a lot* of work to ensure compatibility, and try as we might, every Python release breaks *something*. Major version releases trigger a huge cascade of other work for lots of other people, and I don't think shortening that would be for the overall community good. It just feels like we'd always be playing catch up.
Sticking w/ 18 months is also fine, but then I would like to discuss choosing what months we try to release to get into a date-based release cadence so we know that every e.g. December and June are when releases typically happen thanks to our 6 month bug-fix release cadence. This has the nice benefit of all of us being generally aware of when a bug-fix release is coming up instead of having to check the PEP or go through our mail archive to find out what month a bug-fix is going to get cut (and also something the community to basically count on).
I like the 18 month cycle, because it's a multiple of six, which fits the Ubuntu release cadence (and, as far as I understand, the Fedora cadence as well). Sometimes it might be ambitious to update reverse dependencies in the distro within the two months before the distro freeze, plus two more months during the freeze leading to a distro release, but such is life, and it's then up to distro maintainers of LTS releases to prepare the distro for a new version with only four months left. My hope with time-based releases is that upstreams will also start testing with new versions earlier when they can anticipate the release date. Matthias
On 6 July 2016 at 03:44, Barry Warsaw <barry@python.org> wrote:
For example, 3.6 final will come out in December 2016, so it'll be past our current 16.10 Ubuntu release. We've pretty much decided to carry Python 3.5 through until 17.04, and that'll give us a good year to make 18.04 LTS have a solid Python 3.6 ecosystem.
This aligns pretty well with Fedora's plans - the typical Fedora release dates are May & November, so we will stick with 3.5 for this year's F25 release, while the Fedora 26 Rawhide branch is expected to switch to 3.6 shortly after the first 3.6 beta is released in September. The results in Rawhide should thus help with upstream 3.6 beta testing, with the full release of F26 happening in May 2017 or so.
Projecting ahead, it probably means 3.7 in mid-2018, which is after the Ubuntu 18.04 LTS release, so we'll only do one major transition before the next LTS. From my perspective, that feels about right.
Likewise - 24 months is a bit too slow in getting features out, 12 months expands the community version support matrix too much, while 18 months means that even folks supporting 5* year old LTS Linux releases will typically only be a couple of releases behind the latest version.

Cheers, Nick.

* For folks that don't closely follow the way enterprise Linux distros work, the '5' there isn't arbitrary - it's the lifecycle of Ubuntu LTS releases, and roughly the length of the "Production 1" phase of RHEL releases (where new features may still be added in point releases). Beyond the 5 year mark, I don't think it's particularly reasonable for people to expect free community support, as even Red Hat stops backporting anything other than bug fixes and hardware driver updates at that point. Regardless of your choice of LTS platform, newer versions will be available by the time your current one is that old, so "I don't want to upgrade" is a privilege people can reasonably be expected to pay for.

-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
On Jul 06, 2016, at 10:02 AM, Nick Coghlan wrote:
On 6 July 2016 at 03:44, Barry Warsaw <barry@python.org> wrote:
Projecting ahead, it probably means 3.7 in mid-2018, which is after the Ubuntu 18.04 LTS release, so we'll only do one major transition before the next LTS. From my perspective, that feels about right.
Likewise - 24 months is a bit too slow in getting features out, 12 months expands the community version support matrix too much, while 18 months means that even folks supporting 5* year old LTS Linux releases will typically only be a couple of releases behind the latest version.
Cool. Not that there aren't other distros and OSes involved, but having at least this much alignment is a good sign. I should also note that while Debian has a release-when-ready approach, Python 3.6 alpha 2-ish is available in Debian experimental for those who like the bleeding edge. Cheers, -Barry
On 03.07.2016 16:39, Guido van Rossum wrote:
Another thought recently occurred to me. Do releases really have to be such big productions? A recent ACM article by Tom Limoncelli[1] reminded me that we're doing releases the old-fashioned way -- infrequently, and with lots of manual labor. Maybe we could (eventually) try to strive for a lighter-weight, more automated release process?
I can only recommend such an approach. We've been using it internally for years now, and the workload for releasing, quality assurance and final deployment has dropped significantly. We basically automated everything. The devs are pretty happy with it now and sometimes "mis-use" it for some of its side-products; but that's okay as it's very convenient to use. For some parts we use pip to install/upgrade the dependencies, but CPython might need to use different tooling for the stdlib and its C-dependencies. If you need some assistance here, let me know.
It would be less work, and it would reduce stress for authors of stdlib modules and packages -- there's always the next release. I would think this wouldn't obviate the need for carefully planned and timed "big deal" feature releases, but it could make the bug fix releases *less* of a deal, for everyone.
[1] http://cacm.acm.org/magazines/2016/7/204027-the-small-batches-principle/abst... (sadly requires login)
Best, Sven
On 7/4/16, 3:32 AM, "Python-Dev on behalf of Sven R. Kunze" <python-dev-bounces+kevin-lists=theolliviers.com@python.org on behalf of srkunze@mail.de> wrote:
On 03.07.2016 16:39, Guido van Rossum wrote:
Another thought recently occurred to me. Do releases really have to be such big productions? A recent ACM article by Tom Limoncelli[1] reminded me that we're doing releases the old-fashioned way -- infrequently, and with lots of manual labor. Maybe we could (eventually) try to strive for a lighter-weight, more automated release process?
I can only recommend such an approach. We've been using it internally for years now, and the workload for releasing, quality assurance and final deployment has dropped significantly. We basically automated everything. The devs are pretty happy with it now and sometimes "mis-use" it for some of its side-products; but that's okay as it's very convenient to use.
For some parts we use pip to install/upgrade the dependencies, but CPython might need to use different tooling for the stdlib and its C-dependencies.
If you need some assistance here, let me know.
I also offer my help with setting up CI and automated builds. :) I've actually done build automation for a number of the projects I've worked on in the past. In every case, doing so gave benefits that far outweighed the work needed to get it going. Regards, Kevin
It would be less work, and it would reduce stress for authors of stdlib modules and packages -- there's always the next release. I would think this wouldn't obviate the need for carefully planned and timed "big deal" feature releases, but it could make the bug fix releases *less* of a deal, for everyone.
[1] http://cacm.acm.org/magazines/2016/7/204027-the-small-batches-principle/abst... (sadly requires login)
Best, Sven
On 04Jul2016 0822, Kevin Ollivier wrote:
On 7/4/16, 3:32 AM, "Python-Dev on behalf of Sven R. Kunze" <python-dev-bounces+kevin-lists=theolliviers.com@python.org on behalf of srkunze@mail.de> wrote:
If you need some assistance here, let me know.
I also offer my help with setting up CI and automated builds. :) I've actually done build automation for a number of the projects I've worked on in the past. In every case, doing so gave benefits that far outweighed the work needed to get it going.
It's actually not that much effort - we already have a fleet of buildbots that automatically build, test and report on Python's stability on a range of platforms. Once a build machine is configured, producing a build is typically a single command.

The benefit we get from the heavyweight release procedures is that someone trustworthy (the Release Manager) has controlled the process, reducing the rate of change and ensuring stability over the end of the process. Also that trustworthy people (the build managers) have downloaded, built and signed the code without modifying it or injecting unauthorised code.

As a result of these, people trust official releases to be correct and stable. It's very hard to put the same trust in an automated system (and it's a great way to lose signing certificates).

I don't believe the release procedures are too onerous (though Benjamin, Larry and Ned are welcome to disagree :) ), and possibly there is some more scripting that could help out, but there's really nothing in the direct process that prevents us from doing more releases.

More frequent releases would mean more frequent feature freezes and more time in "cherry-picking" mode (where the RM has to approve and merge each individual fix), which affects all contributors. Shorter cycles make it harder to get changes reviewed, merged and tested. This is the limiting factor.

So don't worry about offering skills/effort for CI systems (unless you want to maintain a few buildbots, in which case read https://www.python.org/dev/buildbot/) - go and help review and improve some patches instead. The shorter the cycle between finding a need and committing the patch, and the more often issues are found *before* commit, the more frequently we can do releases.

Cheers, Steve
I should quickly mention that future workflow-related stuff regarding https://www.python.org/dev/peps/pep-0512 and the move to GitHub (e.g. CI) happens on the core-workflow mailing list. On Mon, 4 Jul 2016 at 15:35 Steve Dower <steve.dower@python.org> wrote:
On 04Jul2016 0822, Kevin Ollivier wrote:
On 7/4/16, 3:32 AM, "Python-Dev on behalf of Sven R. Kunze" <python-dev-bounces+kevin-lists=theolliviers.com@python.org on behalf of srkunze@mail.de> wrote:
If you need some assistance here, let me know.
I also offer my help with setting up CI and automated builds. :) I've actually done build automation for a number of the projects I've worked on in the past. In every case, doing so gave benefits that far outweighed the work needed to get it going.
It's actually not that much effort - we already have a fleet of buildbots that automatically build, test and report on Python's stability on a range of platforms. Once a build machine is configured, producing a build is typically a single command.
The benefit we get from the heavyweight release procedures is that someone trustworthy (the Release Manager) has controlled the process, reducing the rate of change and ensuring stability over the end of the process. Also that trustworthy people (the build managers) have downloaded, built and signed the code without modifying it or injecting unauthorised code.
As a result of these, people trust official releases to be correct and stable. It's very hard to put the same trust in an automated system (and it's a great way to lose signing certificates).
I don't believe the release procedures are too onerous (though Benjamin, Larry and Ned are welcome to disagree :) ), and possibly there is some more scripting that could help out, but there's really nothing in the direct process that prevents us from doing more releases.
More frequent releases would mean more frequent feature freezes and more time in "cherry-picking" mode (where the RM has to approve and merge each individual fix), which affects all contributors. Shorter cycles make it harder to get changes reviewed, merged and tested. This is the limiting factor.
So don't worry about offering skills/effort for CI systems (unless you want to maintain a few buildbots, in which case read https://www.python.org/dev/buildbot/) - go and help review and improve some patches instead. The shorter the cycle between finding a need and committing the patch, and the more often issues are found *before* commit, the more frequently we can do releases.
Cheers, Steve
On 07/03/2016 09:39 AM, Guido van Rossum wrote:
Do releases really have to be such big productions? A recent ACM article by Tom Limoncelli[1] reminded me that we're doing releases the old-fashioned way -- infrequently, and with lots of manual labor. Maybe we could (eventually) try to strive for a lighter-weight, more automated release process?
Glyph suggested this as part of his presentation at the 2015 Language Summit: https://lwn.net/Articles/640181/ I won't summarize his comments here, as Jake already did that for us ;-) //arry/
On 03.07.2016 06:09, Nick Coghlan wrote:
On 2 July 2016 at 16:17, Ludovic Gasc <gmludo@gmail.com> wrote:
Hi everybody,
I fully understand that AsyncIO is a drop in the ocean of CPython, you're working to prepare the entire 3.5.3 release for December, not yet ready. However, you might create a 3.5.2.1 release with only this AsyncIO fix ?
That would be more work than just doing a 3.5.3 release, though - the problem isn't with the version number bump, it's with asking the release team to do additional work without clearly explaining the rationale for the request (more on that below). While some parts of the release process are automated, there are still a lot of steps to run through by a number of different people: https://www.python.org/dev/peps/pep-0101/.
The first key question to answer in this kind of situation is: "Is there code that will run correctly on 3.5.1 that will now fail on 3.5.2?" (i.e. it's a regression introduced by the asyncio and coroutine changes in the point release rather than something that was already broken in 3.5.0 and 3.5.1).
I don't know about 3.5.1 exactly, but things that worked on the branch in early April were broken by the 3.5.2 final release. I was trying to prepare an update to 3.5.2 for Ubuntu 16.04 LTS, and found some regressions, documented at https://launchpad.net/bugs/1586673 (comment #6). It looks like at least the packages nuitka, python-websockets and urwid fail to build with the 3.5.2 release. I still need to investigate. Unless I'm missing something, there is unfortunately no issue in the Python bug tracker, and there is no patch for the 3.5 branch either. My understanding is that it hasn't yet been decided what to do about the issue. Matthias
On Jul 03, 2016, at 01:17 AM, Ludovic Gasc wrote:
If 3.5.2.1 or 3.5.3 are impossible to release before december, what are the alternative solutions for AsyncIO users ? 1. Use 3.5.1 and hope that Linux distributions won't use 3.5.2 ?
Matthias just uploaded a 3.5.2-2 to Debian unstable, which will also soon make its way to Ubuntu 16.10: https://launchpad.net/ubuntu/+source/python3.5/3.5.2-2 Ubuntu 16.04 LTS currently still has 3.5.1. Cheers, -Barry
participants (23)
- Barry Warsaw
- Brett Cannon
- Chris Angelico
- Chris Krycho
- Daniel Holth
- Eric V. Smith
- Ethan Furman
- Guido van Rossum
- Kevin Ollivier
- Larry Hastings
- Ludovic Gasc
- Matthias Klose
- Nathaniel Smith
- Nick Coghlan
- Paul Moore
- Petr Viktorin
- Raymond Hettinger
- Steve Dower
- Steven D'Aprano
- Sven R. Kunze
- Terry Reedy
- Victor Stinner
- Yury Selivanov