
It is unfortunately entirely possible that various berkeleydb libraries have bugs. Since the BerkeleyDB db->del() call isn't returning, it is presumably stuck in a lock waiting for who knows what.
Right. But the SAME berkeley db library is being used for my build of both Python 2.4 alpha 0, and 2.3 maintenance branch, both from cvs, and I can't see any difference in what they're doing with bsddb -- so clearly I must be missing something because it's hanging on EVERY attempt to run the unittest w/2.4, but never w/2.3.
The big difference I see between 2.3cvs and 2.4cvs that could "explain" it is that Lib/bsddb/__init__.py has been updated to use a private (in memory, single process only) DBEnv with locking and thread support enabled. That explains why db->del() would be doing locking, but not why it would deadlock.

This is also easily reproducible here. No special platform or berkeleydb version should be required.

Looking closer, I suspect what is happening is that the Lib/bsddb/__init__.py implementation is not threadsafe. It wants to maintain the current iterator location using a DBCursor object. However, having an active DBCursor holds a lock in the database.

DictMixin's popitem() is effectively:

    k, v = self.iteritems().next()
    del self[k]
    return (k, v)

The iteritems() call creates an internal DBCursor object for the iterator. The next() call on the iterator (DBCursor) looks up the value for k. The following delete attempts to delete the record without using the DBCursor, so it waits on the lock that the still-open cursor holds; thus the deadlock.

If we implement our own popitem() for the bsddb dictionary object (_DBWithCursor) to perform the delete using the cursor, this deadlock in the unit tests would go away. That won't stop users from intermixing iteration over a database with modifications to the database, causing their own deadlocks (very unexpected in single threaded code).

Proposed fix: It should be possible for the bsddb object to maintain internal state of its own about what key it is on, and to close any internal DB cursor on all non-cursor database accesses, leaving the iteration functions to detect this and reopen and reposition the cursor. Since the basic bsddb interface doesn't allow databases with duplicate keys, it shouldn't be too difficult. It's not efficient, but a user who cares about efficient use of berkeleydb should use the real DB/DBEnv interface directly.

How do python dictionaries deal with modifications to the dictionary intermixed with iteration?

Greg
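
To make the proposed fix concrete, here is a rough sketch of the bookkeeping only. It does not touch the real bsddb module: ToyDB, cursor_open and _close_cursor are hypothetical names, a plain dict stands in for the database, and a boolean flag stands in for a live DBCursor holding its lock. It is written for current Python rather than 2.4. The point is just that every non-cursor access drops the cursor, and the iterator repositions itself from the last key it returned.

```python
import bisect


class ToyDB(object):
    """Hypothetical stand-in for _DBWithCursor; a dict replaces the real DB."""

    def __init__(self):
        self._data = {}
        self.cursor_open = False  # models a live DBCursor holding a lock

    def _close_cursor(self):
        # In the real object this would close the internal DBCursor.
        self.cursor_open = False

    def __setitem__(self, key, value):
        self._close_cursor()
        self._data[key] = value

    def __getitem__(self, key):
        self._close_cursor()
        return self._data[key]

    def __delitem__(self, key):
        self._close_cursor()
        del self._data[key]

    def __len__(self):
        return len(self._data)

    def iterkeys(self):
        last = None  # the only iteration state kept between steps
        while True:
            keys = sorted(self._data)
            # Reposition: first key strictly greater than the last one returned.
            pos = 0 if last is None else bisect.bisect_right(keys, last)
            if pos >= len(keys):
                self._close_cursor()
                return
            self.cursor_open = True  # "reopen" the cursor for this step
            last = keys[pos]
            yield last

    def popitem(self):
        for key in self.iterkeys():
            value = self[key]   # non-cursor access: closes the cursor first
            del self[key]       # so the delete has no cursor lock to wait on
            return key, value
        raise KeyError("popitem(): database is empty")


db = ToyDB()
for k in ("a", "b", "c"):
    db[k] = k.upper()

seen = []
for k in db.iterkeys():
    seen.append(k)
    del db[k]  # modification in the middle of iteration
```

Re-sorting the keys on every step is deliberately naive; it mirrors the "not efficient" trade-off above. Relying on unique keys is what lets a single saved key serve as the whole iterator position.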