[spambayes-dev] A new and altogether different bsddb breakage

Richie Hindle richie at entrian.com
Wed Dec 17 18:25:45 EST 2003

[Barry, responding to Tim]
> the BerkeleyDB based storages use the full-blown bsddb
> transactional interface, so from that side of things, they should be
> thread and multiproc safe.

> Note that there is a "full" BDB storage and a "minimal" storage.  The
> latter doesn't retain multiple revisions.

Fantastic.  So in theory at least...

 o All the SpamBayes programs could use BDB-backed ZODB instead of
   directly using bsddb.

 o They would automatically work nicely together with a single writer (e.g.
   sb_server is training while sb_filter is classifying), and with a bit
   more work catching ConflictErrors, we could even have multiple writers.

 o The database wouldn't get significantly bigger than with direct use of
   bsddb.

 o Since BDB uses bsddb in transaction mode rather than single-file mode,
   we can say goodbye to those nasty little DBRunRecovery errors.  Yay!
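
As a rough sketch of what "catching ConflictErrors" for multiple writers
might look like: a writer runs its update and retries if another writer
got in first.  Everything named here is hypothetical.  The ConflictError
class is a local stand-in so the sketch runs without ZODB installed (the
real one lives in ZODB.POSException), and run_with_retries and
flaky_train are made up for illustration.  Real code would also need to
abort the failed transaction before retrying.

```python
import time


class ConflictError(Exception):
    """Local stand-in for ZODB's ConflictError (an assumption, so the
    sketch runs without ZODB installed)."""


def run_with_retries(txn_func, max_attempts=3, delay=0.0):
    """Call txn_func(); if it raises ConflictError, try again.

    Gives up and re-raises after max_attempts failed attempts.
    With real ZODB, the failed transaction would have to be aborted
    before each retry; that step is omitted here.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return txn_func()
        except ConflictError:
            if attempt == max_attempts:
                raise
            time.sleep(delay)


# Hypothetical writer that loses the race twice before succeeding.
attempts = []


def flaky_train():
    attempts.append(1)
    if len(attempts) < 3:
        raise ConflictError("another writer committed first")
    return "trained"


result = run_with_retries(flaky_train, max_attempts=5)
```

The retry limit is there so that a writer that keeps losing the race
fails loudly rather than spinning forever.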

Tim, did this:
> I'm half ready to declare that ZODB is the only database anyone should
> ever use

apply to BDB-backed ZODB, or only to ZODB's native storage?

Unless there's something I'm missing (licensing problems, deployment
problems, portability problems...?) it could be that we should replace our
current DBDictClassifier (which suffers from DBRunRecovery errors and
isn't multiprocess-safe) with a ZODBClassifier using a BDB back end.  From
a position of complete ignorance, I'd hazard a guess that the
implementation would end up a lot simpler than rewriting DBDictClassifier
to use bsddb in full-on transactional mode - the hassles of doing that
have already been sorted out in ZODB.

Am I in cloud cuckoo land?

Richie Hindle
richie at entrian.com
