[Python-Dev] bsddb3 test hang

Barry Warsaw barry@python.org
21 Jul 2003 18:44:53 -0400


On Mon, 2003-07-21 at 18:15, Martin v. Löwis wrote:
> Barry Warsaw <barry@python.org> writes:
>
> > % strace -p 26615
> > futex(0x82fb8c0, FUTEX_WAIT, 0, NULL
>
> NPTL. 'Nuff said.

Perhaps, but there's more:

Tim, Fred, and I spent some time running tests, turning on verbosity,
etc.  I can only reproduce the hang if I run

% make TESTOPTS="-u all" test

and then it hangs almost every time, the second pass through the tests.
It can't be interrupted.  If I turn on -v or hack the test to be
slightly more verbose, or use a less inclusive -u option, it never
hangs.  So clearly <wink> there's some timing issue in the tests.

Tim has had a little more luck on Windows by adding some verbosity and
watching it hang in dbutils.DeadlockWrap() -- or actually infloop in
that function's while 1.  This comes from bsddb/test/test_thread.py and
the only place where DeadlockWrap() gets called without passing in
max_retries (i.e. it runs forever) is in readerThread().  So I added
max_retries to both those calls and now I see DBLockDeadlockErrors in
the readerThreads and AssertionErrors in the writerThreads.  Fred's seen
these assertions and Tim says that Skip reported these assertions too.
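
To make the difference concrete, here is a rough sketch of a retry wrapper
in the spirit of DeadlockWrap().  It is not the bsddb source: the names
retry_on_deadlock, DeadlockError, min_sleep and max_sleep are made up for
illustration, and DeadlockError merely stands in for DBLockDeadlockError.
The point is only that a default of "no limit" loops forever under a
persistent deadlock, while a finite max_retries eventually re-raises.

```python
import time


class DeadlockError(Exception):
    """Hypothetical stand-in for bsddb's DBLockDeadlockError."""


def retry_on_deadlock(function, *args, max_retries=-1,
                      min_sleep=0.001, max_sleep=1.0, **kwargs):
    """Call function(*args, **kwargs), retrying when it reports a deadlock.

    max_retries=-1 (the assumed default here) means "retry forever";
    a non-negative value re-raises once that many retries have failed.
    """
    sleeptime = min_sleep
    retries_left = max_retries
    while True:
        try:
            return function(*args, **kwargs)
        except DeadlockError:
            if retries_left == 0:
                raise  # retry budget exhausted: surface the error
            # A negative counter never reaches 0, so it loops indefinitely.
            retries_left -= 1
            time.sleep(sleeptime)
            # Back off exponentially, capped at max_sleep.
            sleeptime = min(sleeptime * 2, max_sleep)
```

With a bounded max_retries, a reader that keeps losing the deadlock race
fails with a visible exception instead of spinning, which matches the
change from a hang to DBLockDeadlockErrors in the reader threads.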

I think I'm going to leave the max_retries in on the readerThread()
calls to DeadlockWrap().  That'll at least prevent the tests from
hanging.  We'll investigate a bit more to see if we can find out more
about the underlying cause of the assertion failures.  There's some
mystery there too.

-Barry