
Mark Dickinson wrote:
Would it be worth spending some time discussing the buildbot situation at the PyCon 2010 language summit? In the past, I've found the buildbots to be an incredibly valuable resource; especially when working with aspects of Python or C that tend to vary significantly from platform to platform (for me, this usually means floating-point, and platform math libraries, but there are surely many other things it applies to). But more recently there seem to have been some difficulties keeping a reasonable number of buildbots up and running. A secondary problem is that it can be awkward to debug some of the more obscure test failures on buildbots without having direct access to the machine. From conversations on IRC, I don't think I'm alone in wanting to find ways to make the buildbots more useful.
These are actually two issues:
a) where do we get buildbot hardware and operators?
b) how can we reasonably debug problems occurring on buildbots?
For a), I think we can solve this only by redundancy, i.e. create more build slaves, hoping that a sufficient number would be up at any point in time. So: what specific kinds of buildbots do you think are currently lacking? A call for volunteers will likely be answered quickly.
So the question is: how best to invest time and possibly money to improve the buildbot situation (and as a result, I hope, improve the quality of Python)?
I don't think money will really help (I'm skeptical in general that money helps in open source projects). As for time: "buildbot scales", meaning that the buildbot slave admins will all share the load, being responsible only for their own slaves. On the master side: would you be interested in tracking slave admins?
What could be done to make maintenance of build slaves easier?
This is something that only the slave admins can answer. I don't think it's difficult - it's just that people are really unlikely to contribute to the same thing over a period of five years at a steady rate. So we need to make sure to find replacements when people drop out.
Or to encourage interested third parties to donate hardware and time?
Again: I think a call for volunteers would do (Steve, if you are reading this, please hold back just a few days before actually making such a call :-)
Are there good alternatives to Buildbot that might make a difference?
I think people have started working on such a thing. There are certainly alternatives; I'm fairly skeptical that they are *good* alternatives (but then, I'm the one who set up the buildbot installation in the first place).
What do other projects do?
I think that's really difficult to compare, since their testing often has a very different scope. I think CruiseControl is widely used.
These are probably the wrong questions; I'm hoping that a discussion would help produce the right questions, and possibly some answers.
I think these are good questions - just not for the summit. Setting up such a system is, conceptually, easy. It's also just a little work to set it up initially; the difficult part then is to keep it running (and no, a system where anybody can just post test results at any time without prior registration is *still* difficult to keep running). The source of the problem is that such a system can degrade without anybody taking action. If the web server's hard disk breaks down, people panic and look for a solution quickly. If the source control is down, somebody *will* "volunteer" to fix it. If the automated build system produces less useful results, people will worry, but not take action.
Regards,
Martin