
On 12:16 am, matusis@yahoo.com wrote:
I upgraded to 9.0.0 and I am now seeing a new error, not present in 8.2.0 or earlier:
[snip] "/usr/local/encap/python-2.6.4/lib/python2.6/site-packages/Twisted-9.0.0-py2.6-linux-x86_64.egg/twisted/internet/abstract.py", line 267, in stopWriting
    self.reactor.removeWriter(self)
  File "/usr/local/encap/python-2.6.4/lib/python2.6/site-packages/Twisted-9.0.0-py2.6-linux-x86_64.egg/twisted/internet/epollreactor.py", line 145, in removeWriter
    self._remove(writer, self._writes, self._reads, self._selectables, _epoll.OUT, _epoll.IN)
  File "/usr/local/encap/python-2.6.4/lib/python2.6/site-packages/Twisted-9.0.0-py2.6-linux-x86_64.egg/twisted/internet/epollreactor.py", line 131, in _remove
    self._poller._control(cmd, fd, flags)
  File "_epoll.pyx", line 125, in _epoll.epoll._control
exceptions.IOError: [Errno 2] No such file or directory
The error is highly intermittent and occurs only under a high client connection rate. Any idea what this could be?
Translating into English: a descriptor being monitored for writeability is being removed from the reactor, but epoll thinks it isn't being monitored in the first place. It seems likely this is caused by an attempt to remove something twice. Why that would happen will probably take a bit more digging, however.

There was one direct change to epollreactor.py between 8.2 and 9.0:

http://twistedmatrix.com/trac/changeset/26118#file1

It was to reactor shutdown code, though, so it probably isn't coming into play in your case. A number of other indirect changes were made as well (e.g. to the epoll reactor's base classes or other support code it uses). It's conceivable one of these introduced the problem. One could also imagine that the problem existed all along, and one of the changes merely nudged some race condition so that it now goes badly for your app.

As far as suggestions for how to track this down go... Well, minimizing the example is always nice. ;)

Aside from that, one idea that presents itself to me is to instrument the reactor to record addWriter/removeWriter events, and then log the complete stream of them for a particular writer when a double removeWriter is attempted. Initially you might just track that they happen, and use the result to confirm or reject the double removeWriter hypothesis. If it holds up, it might be useful to add stack recording, in order to see why things are happening.

It may even be easy to implement this as a tiny reactor wrapper, which would make it easier to deploy and to enable or disable. If this doesn't overly disrupt your production environment, it might be worth trying.

Keep us updated. :)

Jean-Paul