[Python-Dev] buildbot.python.org down again?
nad at acm.org
Sat Jul 12 03:04:14 CEST 2014
In article <62321D60-1197-47A5-B455-6E5200DD52F7 at stufft.io>,
Donald Stufft <donald at stufft.io> wrote:
> On Jul 8, 2014, at 12:58 AM, Nick Coghlan <ncoghlan at gmail.com> wrote:
> > On 7 Jul 2014 10:47, "Guido van Rossum" <guido at python.org> wrote:
> > > It would still be nice to know who "the appropriate persons" are. Too
> > > much of our infrastructure seems to be maintained by house elves or the
> > > ITA.
> > I volunteered to be the board's liaison to the infrastructure team, and
> > getting more visibility around what the infrastructure *is* and how it's
> > monitored and supported is going to be part of that. That will serve a
> > couple of key purposes:
> > - making the points of escalation clearer if anything breaks or needs
> > improvement (although "infrastructure at python.org" is a good default choice)
> > - making the current "todo" list of the infrastructure team more visible
> > (both to calibrate resolution time expectations and to provide potential
> > contributors an idea of what's involved)
> > Noah has already set up http://status.python.org/ to track service status,
> > I can see about getting buildbot.python.org added to the list.
> We (the infrastructure team) were actually looking at
> buildbot.python.org earlier, and we're not entirely sure who "owns" it.
> Unfortunately a lot of the *.python.org services are in a similar state where
> there is no clear owner. Generally we've not wanted to just step in and take
> over for fear of stepping on someone's toes but it appears that perhaps
> buildbot.p.o has no owner?
In parallel to this discussion, I ran into Noah at a meeting the other
day and we talked a bit about buildbot.python.org. As Donald noted, it
sounds like he and the infrastructure team are willing to add it to the
list of machines they monitor and reboot, as long as they wouldn't be
expected to administer the buildbot master itself. I checked with
Antoine and Martin and they are agreeable to that. So I think there
is general agreement that the infrastructure team can take on uptime
monitoring and rebooting of buildbot.python.org and that Antoine/Martin
would be the primary/secondary contacts/owners for other administrative
issues. Martin would also be happy if the infrastructure team could
handle installing routine security fixes. I'll leave it to the
interested parties to discuss it further among themselves.
nad at acm.org