[Python-Dev] Possible language summit topic: buildbots
"Martin v. Löwis"
martin at v.loewis.de
Sun Oct 25 19:57:48 CET 2009
Antoine Pitrou wrote:
>> For a), I think we can solve this only by redundancy, i.e. create more
>> build slaves, hoping that a sufficient number would be up at any point
>> in time.
> We are already doing this, aren't we?
> It doesn't seem to work very well, it's a bit like a Danaides vessel.
Both true. However, it seems that Mark is unhappy with the current set
of systems, so we probably need to do it again.
> Well, to be fair, buildbots breaking also happens much more frequently
> (perhaps one or two orders of magnitude) than the SVN server or the Web
> site going down. Maintaining them looks like a Sisyphean task, and nobody
> wants that.
It only looks so. It is like any server management task - it takes
constant effort. However, it is not Sisyphean (feeling Greek today,
aren't you? :-), since you actually achieve something. It's not hard to
restart a buildbot when it has crashed, and it gives a warm feeling of
having achieved something.
> I don't know what kind of machines are the current slaves, but if they
> are 24/7 servers, isn't it a bit surprising that the slaves would go down
> so often? Is the buildbot software fragile?
Not really. It sometimes happens that the slaves don't reconnect after
a master restart, but more often it is just a change on the slave side
that breaks it (such as the machine being rebooted without having been
configured to restart the slave afterwards).
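
To illustrate the reboot case, here is a minimal watchdog sketch for a Unix
slave. It assumes the slave writes its PID to a file in its base directory;
the function names, the PID file location and the start command are all
hypothetical and would need adapting to the actual installation.

```python
import os
import subprocess


def read_pid(pidfile):
    """Return the PID stored in pidfile, or None if missing or malformed."""
    try:
        with open(pidfile) as f:
            pid = int(f.read().strip())
    except (OSError, ValueError):
        return None
    # Reject non-positive values: they have special meaning for os.kill().
    return pid if pid > 0 else None


def pid_is_alive(pid):
    """Unix-only liveness check; signal 0 only tests for existence."""
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def ensure_running(pidfile, start_cmd, runner=subprocess.call):
    """Start the slave unless its PID file points at a live process.

    Returns True if a start was attempted, False if the slave was up.
    start_cmd is whatever command starts the slave on this machine
    (an assumption of this sketch, not something buildbot mandates).
    """
    pid = read_pid(pidfile)
    if pid is not None and pid_is_alive(pid):
        return False
    runner(start_cmd)
    return True
```

Run periodically from cron (or once from an @reboot crontab entry, where
cron supports it), a script calling ensure_running() brings the slave back
after a reboot without anyone logging in. Do not use pid_is_alive() on
Windows, where os.kill() with an arbitrary signal terminates the process.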
> Does it require a lot of
> (maintenance, repair) work from the slave owners?
On Unix, not really. On Windows, there is still the issue that
sometimes an error message pops up which you need to click away.
Over several builds, you may find that you have to click away dozens
of such messages. This could use some improvement.
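
One possible mitigation, sketched here under stated assumptions, is to
change the Win32 error mode of the process that launches the builds so
that the system reports failures to the caller instead of showing a
dialog. SetErrorMode and its SEM_* flags are part of the Win32 API; the
helper names below are hypothetical, and this is not a description of
how the buildbot or the Python test suite handle it.

```python
import sys

# Win32 SetErrorMode flags (values from the Windows SDK headers).
SEM_FAILCRITICALERRORS = 0x0001
SEM_NOGPFAULTERRORBOX = 0x0002
SEM_NOOPENFILEERRORBOX = 0x8000


def quiet_error_mode():
    """Return the mask that suppresses the common system error dialogs."""
    return (SEM_FAILCRITICALERRORS
            | SEM_NOGPFAULTERRORBOX
            | SEM_NOOPENFILEERRORBOX)


def suppress_error_dialogs():
    """Apply the quiet error mode on Windows; do nothing elsewhere.

    Returns the previous error mode on Windows, None on other platforms.
    """
    if sys.platform != "win32":
        return None
    import ctypes  # imported lazily; windll only exists on Windows
    return ctypes.windll.kernel32.SetErrorMode(quiet_error_mode())
```

Child processes normally inherit the error mode, so calling
suppress_error_dialogs() once before spawning the build and test
commands should cover them as well. Assertion dialogs raised by a debug
build of the C runtime are controlled separately and are not affected by
this call.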