Worker failures and restart
Yesterday, a large number of buildbots entered an error state. There has been no comment here. I see that some of the failing buildbots, e.g., cstratak-RHEL8-ppc64le, have recovered, but I don't know whether they were explicitly restarted. The AIX and macOS builders remain paused.
I have restarted the AIX buildbot, but I cannot unpause the worker from the Buildbot scheduler. What needs to be done to resume activity?
Thanks, David
Traceback (most recent call last):
  File "/srv/buildbot/venv/lib/python3.9/site-packages/twisted/internet/defer.py", line 580, in _startRunCallbacks
    self._runCallbacks()
  File "/srv/buildbot/venv/lib/python3.9/site-packages/twisted/internet/defer.py", line 662, in _runCallbacks
    current.result = callback(current.result, *args, **kw)
  File "/srv/buildbot/venv/lib/python3.9/site-packages/twisted/internet/defer.py", line 1514, in gotResult
    current_context.run(_inlineCallbacks, r, g, status)
  File "/srv/buildbot/venv/lib/python3.9/site-packages/twisted/internet/defer.py", line 1445, in _inlineCallbacks
    result = current_context.run(g.send, result)
--- <exception caught here> ---
  File "/srv/buildbot/venv/lib/python3.9/site-packages/buildbot/process/build.py", line 403, in startBuild
    yield self.conn.remoteStartBuild(self.builder.name)
  File "/srv/buildbot/venv/lib/python3.9/site-packages/buildbot/worker/protocols/pb.py", line 315, in remoteStartBuild
    workerforbuilder = self.builders.get(builderName)
builtins.AttributeError: 'Connection' object has no attribute 'builders'
I've checked the buildbot, and it seems that no one SSH'ed into it around the time the issue was resolved, so there was no intervention from the client side.
On Mon, Aug 16, 2021 at 2:47 PM David Edelsohn <dje.gcc@gmail.com> wrote:
--
Regards,
Charalampos Stratakis
Senior Software Engineer
Python Maintenance Team, Red Hat
The traceback looks like a legit bug in Buildbot. You can try to report it to Buildbot: https://github.com/buildbot/buildbot
Victor
-- Night gathers, and now my watch begins. It shall not end until my death.
Here are some Twisted logs. I replaced the IP addresses with <billenstein-macos> or <koobs-freebsd-9e36> in the logs. I don't know which worker 52.179.5.160 is; it is the one that triggers a Unicode error.
2021-08-16 12:34:45+0000 [Broker,<koobs-freebsd-9e36>] Got workerinfo from 'koobs-freebsd-9e36'
2021-08-16 12:34:45+0000 [Broker,<koobs-freebsd-9e36>] worker koobs-freebsd-9e36 cannot attach
Traceback (most recent call last):
  File "/srv/buildbot/venv/lib/python3.9/site-packages/twisted/internet/defer.py", line 1571, in _cancellableInlineCallbacks
    _inlineCallbacks(None, g, status)
  File "/srv/buildbot/venv/lib/python3.9/site-packages/twisted/internet/defer.py", line 1443, in _inlineCallbacks
    result = current_context.run(result.throwExceptionIntoGenerator, g)
  File "/srv/buildbot/venv/lib/python3.9/site-packages/twisted/python/failure.py", line 500, in throwExceptionIntoGenerator
    return g.throw(self.type, self.value, self.tb)
  File "/srv/buildbot/venv/lib/python3.9/site-packages/buildbot/worker/base.py", line 725, in attached
    log.err(e, "worker {} cannot attach".format(self.name))
--- <exception caught here> ---
  File "/srv/buildbot/venv/lib/python3.9/site-packages/buildbot/worker/base.py", line 723, in attached
    yield super().attached(bot)
builtins.AssertionError:
2021-08-16 12:59:21+0000 [Broker,<billenstein-macos>] ping finished: success
2021-08-16 12:59:21+0000 [Broker,<billenstein-macos>] while start_build
Traceback (most recent call last):
  File "/srv/buildbot/venv/lib/python3.9/site-packages/twisted/internet/defer.py", line 580, in _startRunCallbacks
    self._runCallbacks()
  File "/srv/buildbot/venv/lib/python3.9/site-packages/twisted/internet/defer.py", line 662, in _runCallbacks
    current.result = callback(current.result, *args, **kw)
  File "/srv/buildbot/venv/lib/python3.9/site-packages/twisted/internet/defer.py", line 1514, in gotResult
    current_context.run(_inlineCallbacks, r, g, status)
  File "/srv/buildbot/venv/lib/python3.9/site-packages/twisted/internet/defer.py", line 1445, in _inlineCallbacks
    result = current_context.run(g.send, result)
--- <exception caught here> ---
  File "/srv/buildbot/venv/lib/python3.9/site-packages/buildbot/process/build.py", line 403, in startBuild
    yield self.conn.remoteStartBuild(self.builder.name)
  File "/srv/buildbot/venv/lib/python3.9/site-packages/buildbot/worker/protocols/pb.py", line 315, in remoteStartBuild
    workerforbuilder = self.builders.get(builderName)
builtins.AttributeError: 'Connection' object has no attribute 'builders'
2021-08-16 13:00:43+0000 [Broker,12154,52.179.5.160] Peer will receive following PB traceback:
2021-08-16 13:00:43+0000 [Broker,12154,52.179.5.160] Unhandled Error
Traceback (most recent call last):
  File "/srv/buildbot/venv/lib/python3.9/site-packages/twisted/spread/banana.py", line 176, in gotItem
    self.callExpressionReceived(item)
  File "/srv/buildbot/venv/lib/python3.9/site-packages/twisted/spread/banana.py", line 137, in callExpressionReceived
    self.expressionReceived(obj)
  File "/srv/buildbot/venv/lib/python3.9/site-packages/twisted/spread/pb.py", line 602, in expressionReceived
    method(*sexp[1:])
  File "/srv/buildbot/venv/lib/python3.9/site-packages/twisted/spread/pb.py", line 996, in proto_message
    self._recvMessage(
--- <exception caught here> ---
  File "/srv/buildbot/venv/lib/python3.9/site-packages/twisted/spread/pb.py", line 1050, in _recvMessage
    netResult = object.remoteMessageReceived(self, message, netArgs, netKw)
  File "/srv/buildbot/venv/lib/python3.9/site-packages/twisted/spread/flavors.py", line 131, in remoteMessageReceived
    state = method(*args, **kw)
  File "/srv/buildbot/venv/lib/python3.9/site-packages/buildbot/process/remotecommand.py", line 187, in remote_update
    updates = decode(updates)
  File "/srv/buildbot/venv/lib/python3.9/site-packages/buildbot/pbutil.py", line 169, in decode
    return data_type(map(decode, data))
  File "/srv/buildbot/venv/lib/python3.9/site-packages/buildbot/pbutil.py", line 169, in decode
    return data_type(map(decode, data))
  File "/srv/buildbot/venv/lib/python3.9/site-packages/buildbot/pbutil.py", line 169, in decode
    return data_type(map(decode, data))
  File "/srv/buildbot/venv/lib/python3.9/site-packages/buildbot/pbutil.py", line 169, in decode
    return data_type(map(decode, data))
  File "/srv/buildbot/venv/lib/python3.9/site-packages/buildbot/pbutil.py", line 165, in decode
    return bytes2unicode(data, encoding, errors)
  File "/srv/buildbot/venv/lib/python3.9/site-packages/buildbot/util/__init__.py", line 272, in bytes2unicode
    return str(x, encoding, errors)
builtins.UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe6 in position 489: invalid continuation byte
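For anyone trying to reproduce the last error locally: the final frame is a plain `str(x, encoding, errors)` call, so the same message can be triggered without Buildbot. The sketch below is only an illustration. The sample bytes are made up (the real payload sent by the worker is not in the logs), and `decode_strict` / `decode_lenient` are hypothetical helper names, not Buildbot functions. It assumes the worker sent text in a single-byte encoding such as Latin-1, where 0xe6 is "æ"; that is a guess, not something the logs confirm.

```python
# Hypothetical helpers mirroring the str(x, encoding, errors) call in the traceback.
def decode_strict(data: bytes) -> str:
    # "strict" raises UnicodeDecodeError on any invalid UTF-8 sequence.
    return str(data, "utf-8", "strict")


def decode_lenient(data: bytes) -> str:
    # "replace" substitutes U+FFFD for undecodable bytes instead of raising.
    return str(data, "utf-8", "replace")


# Made-up sample: "æble" encoded as Latin-1 gives b"\xe6ble".
# In UTF-8, 0xe6 starts a three-byte sequence, but "b" is not a continuation byte.
raw = "æble".encode("latin-1")

error = None
try:
    decode_strict(raw)
except UnicodeDecodeError as exc:
    error = exc
    print(f"byte {raw[exc.start]:#x} at position {exc.start}: {exc.reason}")

lenient = decode_lenient(raw)
print(repr(lenient))
```

If the worker's output really is in a non-UTF-8 locale, a lenient error handler on the master side would hide the symptom, while fixing the worker's locale or the offending test output would address the cause. Which one applies here is still an open question.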
participants (3)
- Charalampos Stratakis
- David Edelsohn
- Victor Stinner