[spambayes-dev] A new and altogether different bsddb breakage
Tim Peters
tim.one at comcast.net
Thu Dec 18 14:09:09 EST 2003
[Kenny Pitt]
> The bsddb package includes a dbshelve module that handles all the
> required dictionary access methods to provide compatibility with
> standard shelve functionality. It also allows specifying the DB_ENV
> when opening the database.
Speaking of which, 4 of the test_bsddb3.py tests fail on Win98SE with the
soon-to-be-released Python 2.3.3 (which is at least as well as that test has
ever done on that platform). The 4 failing tests all exercise the dbshelve
module:
ERROR: test01_basics (bsddb.test.test_dbshelve.EnvBTreeShelveTestCase)
ERROR: test01_basics (bsddb.test.test_dbshelve.EnvHashShelveTestCase)
ERROR: test01_basics (bsddb.test.test_dbshelve.EnvThreadBTreeShelveTestCase)
ERROR: test01_basics (bsddb.test.test_dbshelve.EnvThreadHashShelveTestCase)
and all die with the same traceback and error:
Traceback (most recent call last):
  File "C:\CODE\23\lib\bsddb\test\test_dbshelve.py", line 75, in test01_basics
    self.do_open()
  File "C:\CODE\23\lib\bsddb\test\test_dbshelve.py", line 238, in do_open
    self.env.open(homeDir, self.envflags | db.DB_INIT_MPOOL | db.DB_CREATE)
DBAgainError: (11, 'Resource temporarily unavailable -- unable to join the environment')
If that isn't just an artifact of something else the test suite is doing,
it's enough to kill the idea of using dbshelve on Windows.
> The only thing it doesn't seem to handle is transactions, but I'm not
> convinced we need that.
>
> Transactions are only really important if you are updating several
> related entries, and need to be able to rollback the whole lot if any
> one of them fails.
I expect a transaction commit supplies a natural and useful boundary for
doing a database checkpoint operation (see earlier email w/ Barry; making
frequent checkpoints is probably important so that running recovery when the
database is opened runs quickly).
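To make the checkpoint argument concrete, here is a toy model in plain Python. It is deliberately not the Berkeley DB API: the class ToyLoggedStore, its methods, and the helper replayed_after_crash are made-up names for illustration only. It assumes the simplest possible scheme, where recovery starts from the last checkpointed state and replays every log record written since then, so the cost of recovery is the number of records since the last checkpoint.

```python
class ToyLoggedStore:
    """Toy write-ahead-logged store (hypothetical; NOT the Berkeley DB API)."""

    def __init__(self):
        self.snapshot = {}   # state as of the last checkpoint (survives a crash)
        self.log = []        # committed records since that checkpoint (survives a crash)
        self.live = {}       # in-memory state (lost in a crash)

    def commit(self, updates, checkpoint=False):
        """Apply one transaction's updates; optionally checkpoint right after."""
        self.log.extend(updates.items())
        self.live.update(updates)
        if checkpoint:
            self.checkpoint()

    def checkpoint(self):
        """Fold the log into the snapshot so recovery need not replay it."""
        self.snapshot = dict(self.live)
        self.log = []

    def recover(self):
        """Rebuild live state after a crash; return the number of records replayed."""
        self.live = dict(self.snapshot)
        for key, value in self.log:
            self.live[key] = value
        return len(self.log)


def replayed_after_crash(n_commits, checkpoint_every):
    """Run n_commits one-record transactions, crash, recover, and count replays.

    checkpoint_every=0 means never checkpoint.
    """
    store = ToyLoggedStore()
    for i in range(n_commits):
        do_checkpoint = bool(checkpoint_every) and (i + 1) % checkpoint_every == 0
        store.commit({"word%d" % i: i}, checkpoint=do_checkpoint)
    expected = dict(store.live)
    store.live = {}              # simulate the crash: in-memory state is gone
    replayed = store.recover()
    assert store.live == expected  # recovery must restore every committed update
    return replayed
```

In this model, replayed_after_crash(1005, 0) replays all 1005 records, while replayed_after_crash(1005, 10) replays only the 5 written after the last checkpoint. Hanging the checkpoint off the commit is just a convenient place to put it; nothing in the toy depends on real transaction semantics.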
> ...
> The important thing re our suspected cause would be the multi-thread
> and multi-process locking, and that can be used independently of
> transactions.
Gregory Smith found and fixed several bugs in the bsddb3 use-it-like-a-dict
wrappers we've *been* using, all related to concurrent access.
Unfortunately, it doesn't look like anyone backported those fixes for the
Python 2.3 release (the last few checkins only exist on the trunk, which is
Python 2.4 development).
Given the history of bsddb3 support so far, I think we'll be best off using
the Berkeley-native APIs as directly as possible, avoiding "convenience
wrappers" like the plague. Very little of our code interacts with the
database directly, and bugs in those wrappers have probably caused hundreds
of times more hours of bug-chasing than would have been required to write a
few extra lines of lower-level code. Of course, using the Berkeley-native
API directly should run faster too, but I don't hold that against it
*too* much <wink>.