I spent some time yesterday looking into the intermittent test failures in
test_tcp and test_threadpool. These failures only occurred when the full
test suite was run (./admin/runtests -v -t), not when test_tcp was run by
itself (./admin/runtests -v -t twisted.test.test_tcp). I finally figured out
what was happening and wanted to share my findings to help others track down
Early in the test sequence, test_woven did some kind of session/login
thing. One particular test case was verifying an Unauthorized response,
which involved creating a Deferred that was then .errback()'ed. However,
there was no errback handler added to the Deferred.
Inside that test, web.guard started a 60 second session timeout, with a
callLater() function that referenced that Deferred (or an object holding
test_woven finishes. The timer is still running. The Deferred (with pending
error) is still alive, since the callLater timer prevents it from being
Other test cases run.
[60 seconds pass]
Some other innocent test case is running, depending upon the speed of the
machine. On my CPU, it was usually test_tcp.
During some call to reactor.iterate, the web.guard session timer expires,
the callback does whatever callbacky things it likes, and an object gets
freed. That leads to the Deferred being freed. The Deferred has a pending
error but no .errback handler. Deferred.__del__ does log.msg() with the
"Unhandled error in Deferred" message, and log.err() with the
"Unauthorized" Failure object that wanted an .errback but had none.
log.err() puts the message in the error queue. log.msg() goes into the
test.log file, but not into the error queue.
test_tcp finishes without problem, but unittest.py looks in the error queue
and sees the Failure. It adds the traceback of the Failure (which is the
call trace leading up to the creation of the Unauthorized exception, which
is in web.guard, and doesn't mention test_woven at all) to the error
results of the test.
test_tcp is flunked, and as explanation unittest.py emits the error results
(the Unauthorized exception and the web.guard stack trace)
The basic issue is timers that "leak" over from one unit test to another.
Unit tests are supposed to be isolated from each other, but if one test
changes the state of the reactor (by adding timers or establishing listenTCP
ports) without cleaning it up when it finishes, that state can affect other
This is not always obvious, as several modules create timers internally.
This is very convenient, and works fine in normal applications. It is only
in the unit tests where it causes a problem.
I've added IReactorTime.getDelayedCalls() as a reactor-portable way to
retrieve the list of pending callLater() timers, and changed unittest.py to
cancel all pending timers between test cases. This succeeded in moving the
failure back to the test that caused it. I've also added a warning message
(which goes into test.log) that indicates when a test leaves timers running
after it completes. There are a handful that do; most are benign but should
probably be cleaned up. This warning message may turn into a test failure
Some tools I used to track this down, in case they prove useful to others:
added code to Deferred and DelayedCall to record the stack track when they
are created, then print them in __repr__ or added to the "unhandled error
in deferred" Failure. The traceback of creation is often more useful than
the traceback of .callback, .errback, or timer execution.
added .extra attribute to Failures, insert its contents in the traceback.
I used this to display the Deferred-creation traceback.
There are still a few issues: I need to implement getDelayedCalls for the
cReactor, and my unit test for getDelayedCalls is stymied by a recurring
timer created when the 'popsicle' module is imported, but in general it
seems to make the unit test suite behave sensibly (no more intermittent