Most 3.x buildbots are green again: please don't break them, and do watch them!
Hi,

Over the last few months, most 3.x buildbots were failing randomly, and some of them were failing constantly. I spent some time fixing almost all of the Windows and Linux buildbots; there were a lot of different issues. So please try not to break the buildbots again, and remember to watch them from time to time:

http://buildbot.python.org/all/waterfall?category=3.x.stable&category=3.x.unstable

In the coming weeks I will try to backport some fixes to Python 3.5 (where needed) to make those buildbots more stable as well.

The Python 2.7 buildbots are also in a sad state (e.g. test_marshal segfaults on Windows, see issue #25264), but it's not easy to get a Windows machine with the right compiler to develop on Python 2.7 on Windows.

--

Maybe it's time to move more 3.x buildbots to the "stable" category?

http://buildbot.python.org/all/waterfall?category=3.x.stable

By the way, I don't understand why "AMD64 OpenIndiana 3.x" is considered stable, since it has been failing with multiple issues for many months and nobody is working on these failures. I suggest moving this buildbot back to the unstable category.

--

We have many offline buildbots. What is the status of these buildbots? Should we expect them to come back soon? Or would it be possible to hide them? That would make it easier to check the status of all buildbots.

--

Failing buildbots:

- AMD64 FreeBSD CURRENT 3.x: http://bugs.python.org/issue26566 -- I installed a fresh FreeBSD CURRENT in a VM and I'm unable to reproduce the failures. Maybe the buildbot slave is outdated and FreeBSD must be upgraded?

- AMD64 OpenIndiana 3.x, x86 OpenIndiana 3.x: test_socket failures on sendfile. Sorry, but I'm not really interested in this OS.

- PPC64 AIX 3.x: failing tests: test_httplib, test_httpservers, test_socket, test_distutils, test_asyncio, (...); random timeout failures in test_eintr, etc. I don't have access to AIX and I'm not interested in acquiring an AIX license, nor in installing it. I'm not sure it's useful to have an AIX buildbot when no core developer has access to AIX and nobody is working on the AIX failures. Maybe IBM wants to help us support AIX? (Provide manpower, access to AIX servers, or something like that.)

- x86 OpenBSD 3.x: 5 tests failed: test_crypt, test_socket, test_ssl, test_strptime, test_time. This OS needs some love ;-)

- The 4 ICC buildbots are failing with stack overflows, segfaults, etc. Again, I'm not sure that these buildbots are useful, since it looks like we don't support this compiler yet. Or do they help the work on supporting this compiler? Who is working on ICC support?

--

FYI, I also made some enhancements to regrtest (our test runner for the test suite), mostly to help debug failures:

- display the duration of tests taking longer than 30 seconds
- new timestamp prefix, used to debug buildbot hangs
- when parallel tests are interrupted, display progress while waiting for completion
- add a timeout to the main process when using -jN: it should help to debug buildbot hangs
- a "Run tests in parallel using 3 child processes" or "Run tests sequentially" message, which helps to understand how the tests are being run. Beware the -j1 trap, which has no effect: tests are still run sequentially. By the way, I proposed to really use subprocesses when -j1 is used: http://bugs.python.org/issue25285

The default timeout changed from 1 hour to 15 min; it's the maximum duration allowed for a single test file (e.g. test_os.py). On my Linux box, running the whole test suite in parallel (10 child processes for my 4 CPU cores with hyperthreading) with Python compiled in debug mode (slow) takes 4 min 37 sec. Tell me if the default timeout is too low. It can be configured per buildbot if needed (TESTTIMEOUT env var).
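Here is a rough sketch, for illustration only, of how a slave-side wrapper could pass these options to regrtest and honour TESTTIMEOUT; the -j, --timeout and -w options are regrtest's own, but the wrapper itself is not the real buildbot step:

    # Sketch only: launch regrtest the way a buildbot step might,
    # honouring a TESTTIMEOUT environment variable.
    import os
    import subprocess
    import sys

    timeout = os.environ.get("TESTTIMEOUT", "900")  # 15 min per test file by default
    cmd = [
        sys.executable, "-m", "test",
        "-j", "10",              # run tests in 10 child processes
        "--timeout", timeout,    # abort a test file running longer than this
        "-w",                    # re-run failed tests in verbose mode
    ]
    sys.exit(subprocess.call(cmd))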
--

By the way, I'm always surprised by the huge difference in the time needed to run a build on the different slaves: from a few minutes to more than 3 hours. The fastest Windows slave takes 28 minutes (running tests in parallel using 4 child processes), whereas the 3 others (which run tests sequentially) take between 2 hours and more than 3 hours! Why does running the tests take so long on Windows?

Maybe we should make sure that no buildbot runs tests sequentially, because doing so creates a lot of annoying side effects (even if it sometimes helps to find tricky bugs, sometimes bugs restricted to the tests themselves), and because much of the time is simply spent waiting a few seconds. Running multiple tests in parallel doesn't burn your CPU, it's just faster. IMHO the risk of random timeout failures is low compared to the speedup.

--

The most interesting bug was a deadlock in locale.setlocale() on Windows 7: the bug made the buildbot hang "sometimes" (randomly). Jeremy Kloth identified the bug, and Steve Dower pointed out that it is already fixed in Visual Studio 2015 Update 1, so please update VS if you haven't already. Steve added a post-build test to check whether the ucrtbase/ucrtbased DLL has the known bug.

=> http://bugs.python.org/issue26624

Victor
On Wed, Apr 13, 2016 at 9:40 PM, Victor Stinner <victor.stinner@gmail.com> wrote:
Maybe it's time to move more 3.x buildbots to the "stable" category? http://buildbot.python.org/all/waterfall?category=3.x.stable
Move the Bruces into stable, perhaps? The AMD64 Debian Root one. Been fairly consistently green. ChrisA
On 4/13/2016 7:40 AM, Victor Stinner wrote:
Over the last few months, most 3.x buildbots were failing randomly, and some of them were failing constantly. I spent some time fixing almost all of the Windows and Linux buildbots; there were a lot of different issues.
Thanks for all of your work on this, Victor. It's much appreciated. Eric.
On 13/04/2016 12:40, Victor Stinner wrote:
Over the last few months, most 3.x buildbots were failing randomly, and some of them were failing constantly. I spent some time fixing almost all of the Windows and Linux buildbots; there were a lot of different issues.
Can I state the obvious and offer a huge vote of thanks for this work, which is often tedious and unrewarding? Thank you TJG
On Wed, 13 Apr 2016 at 05:57 Tim Golden <mail@timgolden.me.uk> wrote:
On 13/04/2016 12:40, Victor Stinner wrote:
Over the last few months, most 3.x buildbots were failing randomly, and some of them were failing constantly. I spent some time fixing almost all of the Windows and Linux buildbots; there were a lot of different issues.
Can I state the obvious and offer a huge vote of thanks for this work, which is often tedious and unrewarding?
Yep, big thanks from me as well!
Victor Stinner <victor.stinner <at> gmail.com> writes:
Maybe it's time to move more 3.x buildbots to the "stable" category? http://buildbot.python.org/all/waterfall?category=3.x.stable
+1 I think anything that is actually stable should be in that category.
By the way, I don't understand why "AMD64 OpenIndiana 3.x" is considered stable, since it has been failing with multiple issues for many months and nobody is working on these failures. I suggest moving this buildbot back to the unstable category.
+1 The bot was very stable and fast for some time but has been unstable for at least a year.
- PPC64 AIX 3.x: failing tests: test_httplib, test_httpservers, test_socket, test_distutils, test_asyncio, (...); random timeout failures in test_eintr, etc. I don't have access to AIX and I'm not interested in acquiring an AIX license, nor in installing it. I'm not sure it's useful to have an AIX buildbot when no core developer has access to AIX and nobody is working on the AIX failures. Maybe IBM wants to help us support AIX? (Provide manpower, access to AIX servers, or something like that.)
Well, I think in this case it's the gcc AIX maintainer running it, so... I think we should have a policy to stop reporting issues on unstable bots unless someone has a concrete fix OR the bot maintainers are known to fix issues fast (but that does not seem to be the case). Stefan Krah
On Wed, 13 Apr 2016 at 06:14 Stefan Krah <stefan@bytereef.org> wrote:
Victor Stinner <victor.stinner <at> gmail.com> writes:
Maybe it's time to move more 3.x buildbots to the "stable" category? http://buildbot.python.org/all/waterfall?category=3.x.stable
+1 I think anything that is actually stable should be in that category.
By the way, I don't understand why "AMD64 OpenIndiana 3.x" is considered stable, since it has been failing with multiple issues for many months and nobody is working on these failures. I suggest moving this buildbot back to the unstable category.
+1 The bot was very stable and fast for some time but has been unstable for at least a year.
- PPC64 AIX 3.x: failing tests: test_httplib, test_httpservers, test_socket, test_distutils, test_asyncio, (...); random timeout failures in test_eintr, etc. I don't have access to AIX and I'm not interested in acquiring an AIX license, nor in installing it. I'm not sure it's useful to have an AIX buildbot when no core developer has access to AIX and nobody is working on the AIX failures. Maybe IBM wants to help us support AIX? (Provide manpower, access to AIX servers, or something like that.)
Well, I think in this case it's the gcc AIX maintainer running it, so...
I think we should have a policy to stop reporting issues on unstable bots unless someone has a concrete fix OR the bot maintainers are known to fix issues fast (but that does not seem to be the case).
Official policy per https://www.python.org/dev/peps/pep-0011/#supporting-platforms states that there must be a core developer to maintain compatibility, so if there's no one helping to keep a particular buildbot green then I agree it should be marked as unstable and thus not supported.
On Wed, Apr 13, 2016 at 6:40 AM, Victor Stinner <victor.stinner@gmail.com> wrote:
Hi,
Over the last few months, most 3.x buildbots were failing randomly, and some of them were failing constantly. I spent some time fixing almost all of the Windows and Linux buildbots; there were a lot of different issues.
Thank you for doing this!
Maybe it's time to move more 3.x buildbots to the "stable" category? http://buildbot.python.org/all/waterfall?category=3.x.stable
A few months ago, I put together a list of suggestions for updating the stable/unstable list, but never got around to implementing it.
We have many offline buildbots. What is the status of these buildbots? Should we expect them to come back soon?
My Windows 8.1 bot is a VM that resides on a machine that has been disturbingly unstable lately, and it's starting to seem like the instability is due to that VM. I hope to have it back up (and stable) again soon, but have no timetable for it. My Docs bot was off after losing power over the weekend, and I just hadn't noticed yet. It's back now. I'll ping the python-buildbots list about other offline bots.
Or would it be possible to hide them? That would make it easier to check the status of all buildbots.
I'm not sure, but that would be a nice feature.
- The 4 ICC buildbots are failing with stack overflows, segfaults, etc. Again, I'm not sure that these buildbots are useful, since it looks like we don't support this compiler yet. Or do they help the work on supporting this compiler? Who is working on ICC support?
The Ubuntu ICC bot is generally quite stable. The OSX ICC bot is currently offline, but has only a couple of known issues. The Windows ICC bot is still a bit experimental, but has inched closer to producing a working build. R. David Murray and I have been working with Intel on ICC support.
By the way, I'm always surprised by the huge difference in the time needed to run a build on the different slaves: from a few minutes to more than 3 hours. The fastest Windows slave takes 28 minutes (running tests in parallel using 4 child processes), whereas the 3 others (which run tests sequentially) take between 2 hours and more than 3 hours! Why does running the tests take so long on Windows?
Most of that is down to debug mode; building Python in debug mode links with the debug CRT, which also enables all manner of extra checks. When it's up, the non-debug Windows bot also runs the test suite in ~28 minutes, running sequentially.

---

After receiving a suggestion from koobs several months ago, I've been intermittently thinking about completely redoing our buildmaster setup such that instead of a single builder per version on each slave, we set up a series of builders with particular 'tags', and each builder attaches to each slave that satisfies the tags (running each build only on the first slave available). This would allow us to test some of the rarer options (such as --without-threads) significantly more often than 'never', and generally get a lot more customization/flexibility of builds. I haven't had a chance to sit down and think out all the edge cases of this idea, but what do people generally think of it?

I think the GitHub switchover will be a good time to do this if it's generally seen as a decent idea, since there will need to be some work on the buildmaster to do the switch anyway.

-- Zach
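To make the proposal above a little more concrete, here is a toy sketch of the tag matching in plain Python; it is not real buildbot configuration, and the slave names and tags are invented:

    # Toy illustration: builders declare the tags they need, and attach to
    # every slave that advertises (at least) those tags.
    slaves = {
        "ware-ubuntu": {"linux", "gcc", "amd64"},        # invented names/tags
        "ware-win81": {"windows", "msvc", "amd64"},
        "koobs-freebsd": {"freebsd", "clang", "amd64"},
    }
    builders = {
        "3.x.nothreads": {"linux", "gcc"},   # e.g. a --without-threads build
        "3.x.msvc": {"windows", "msvc"},
        "3.x.clang": {"clang"},
    }
    for builder, required in sorted(builders.items()):
        attached = sorted(s for s, tags in slaves.items() if required <= tags)
        print(builder, "->", attached)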
On Wed, 13 Apr 2016 at 13:17 Zachary Ware <zachary.ware+pydev@gmail.com> wrote:
[SNIP] ---
After receiving a suggestion from koobs several months ago, I've been intermittently thinking about completely redoing our buildmaster setup such that instead of a single builder per version on each slave, we set up a series of builders with particular 'tags', and each builder attaches to each slave that satisfies the tags (running each build only on the first slave available). This would allow us to test some of the rarer options (such as --without-threads) significantly more often than 'never', and generally get a lot more customization/flexibility of builds. I haven't had a chance to sit down and think out all the edge cases of this idea, but what do people generally think of it? I think the GitHub switchover will be a good time to do this if it's generally seen as a decent idea, since there will need to be some work on the buildmaster to do the switch anyway.
So we'd have slaves connect to multiple builders, each of which has requirements about what it tests? So the --without-threads builder would have all slaves able to compile --without-threads connect to it and then do that build? And those same slaves might also connect to the gcc and clang builders to do those builds as well? So would that mean slaves could potentially do a bunch of builds per change? That sounds nice to me, as long as the slave maintainers are also up to utilizing this by doubling/tripling/quadrupling their builds.
On 13.04.16 14:40, Victor Stinner wrote:
Over the last few months, most 3.x buildbots were failing randomly, and some of them were failing constantly. I spent some time fixing almost all of the Windows and Linux buildbots; there were a lot of different issues.
Excellent! Many thanks for doing this. And the new regrtest features look nice.
So please try not to break the buildbots again, and remember to watch them from time to time:
http://buildbot.python.org/all/waterfall?category=3.x.stable&category=3.x.unstable
A desirable but nonexistent feature would be to email the authors of commits that broke the buildbots. How hard would it be to implement this?
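For what it's worth, buildbot itself ships a MailNotifier that can do roughly this: mail the users associated with the change that broke a build. A sketch of what the master-side configuration might look like; the import path and accepted arguments vary between buildbot versions, and the addresses/domains below are made up:

    # Sketch, not the actual python.org buildmaster configuration.  In
    # buildbot 0.8.x the notifier lives in buildbot.status.mail; newer
    # releases moved it to buildbot.plugins.reporters.
    from buildbot.status.mail import MailNotifier

    mn = MailNotifier(
        fromaddr="buildbot@example.org",  # assumed sender address
        sendToInterestedUsers=True,       # mail the authors of the blamed changes
        mode="problem",                   # only when a build goes from passing to failing
        lookup="example.org",             # assumed mapping of user names to email domain
    )
    c['status'].append(mn)                # 'c' is the BuildmasterConfig dict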
In the coming weeks I will try to backport some fixes to Python 3.5 (where needed) to make those buildbots more stable as well.
The Python 2.7 buildbots are also in a sad state (e.g. test_marshal segfaults on Windows, see issue #25264), but it's not easy to get a Windows machine with the right compiler to develop on Python 2.7 on Windows.
What do you think about backporting the recent regrtest to 2.7? The features I need most are the -m and -G options.
Maybe it's time to move more 3.x buildbots to the "stable" category? http://buildbot.python.org/all/waterfall?category=3.x.stable
+1
By the way, I don't understand why "AMD64 OpenIndiana 3.x" is considered stable, since it has been failing with multiple issues for many months and nobody is working on these failures. I suggest moving this buildbot back to the unstable category.
I think the main cause is the lack of memory on this buildbot. I tried to minimize memory consumption and leaks, but some leaks remain, and they provoke other test failures and additional resource leaks. It would be nice to add a feature for running every test in a separate subprocess; that would isolate the effects of failed tests.
On 14 April 2016 at 09:15, Serhiy Storchaka <storchaka@gmail.com> wrote:
On 13.04.16 14:40, Victor Stinner wrote:
By the way, I don't understand why "AMD64 OpenIndiana 3.x" is considered stable, since it has been failing with multiple issues for many months and nobody is working on these failures. I suggest moving this buildbot back to the unstable category.
I think the main cause is the lack of memory on this buildbot. I tried to minimize memory consumption and leaks, but some leaks remain, and they provoke other test failures and additional resource leaks. It would be nice to add a feature for running every test in a separate subprocess; that would isolate the effects of failed tests.
Last time I looked into the OpenIndiana buildbot, I concluded that the biggest problem was Python using fork() to spawn subprocesses. I understand that this OS does not do "memory overcommitment" like Linux does, so every time you fork, the OS has to double the amount of memory that is reserved. Ironically, running each test using the current subprocess module (which uses fork) would probably make the problem worse. I suspect that using posix_spawn() where possible would help a lot, but that was rejected in <https://bugs.python.org/issue20104> for not being flexible enough and for making maintenance too complicated.
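A small illustration of that point; the sizes are arbitrary, and the failure is only what one would expect on an OS with strict memory accounting (Linux, with overcommit enabled, normally won't fail here):

    # Illustration only: on an OS without memory overcommitment, forking a
    # large parent can fail even though the child exec()s immediately,
    # because the OS must be able to back a full copy of the parent's
    # address space for the child.
    import subprocess

    ballast = bytearray(512 * 1024 * 1024)  # make the parent ~512 MB big

    try:
        # In Python 3.5, subprocess uses fork()+exec() on POSIX, so this
        # momentarily needs another ~512 MB of commitable memory.
        subprocess.check_call(["true"])
        print("spawn succeeded")
    except OSError as exc:
        print("spawn failed:", exc)  # typically ENOMEM under strict accounting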
On Wed, Apr 13, 2016 at 4:40 AM, Victor Stinner <victor.stinner@gmail.com> wrote:
Over the last few months, most 3.x buildbots were failing randomly, and some of them were failing constantly. I spent some time fixing almost all of the Windows and Linux buildbots; there were a lot of different issues.
So please try not to break the buildbots again, and remember to watch them from time to time:
Piling on my thanks again, Victor. Fixing all the buildbots is a great gesture on your part. Keeping them stable is the proper thing to do and should be expected from all committers. -- Senthil
participants (10)
- Brett Cannon
- Chris Angelico
- Eric V. Smith
- Martin Panter
- Senthil Kumaran
- Serhiy Storchaka
- Stefan Krah
- Tim Golden
- Victor Stinner
- Zachary Ware