[Spambayes] Upgrade problem

Sjoerd Mullender sjoerd@acm.org
Thu Nov 7 14:34:45 2002


On Thu, Nov 7 2002 Just van Rossum wrote:

> François Granger wrote:
> 
> > > after each message you have to wait (up to 10
> > > seconds on my machine with my database) before you can continue. Maybe an
> > > explicit "Save database" button is an idea?
> > 
> > With the -d parameter, you can use an anydbm database instead of a pickle.
> > With some hacking it can probably use gdbm as the anydbm db.
> 
> Ok, so I did it. With my current setup anydbm uses dbhash/bsddb, and training
> (on a single message) performance seems _worse_ than with the pickle (about 20
> seconds now, around 10 with pickle). Don't know whether the training itself is
> slower or updating the database. Training with my entire corpus took many times
> longer as well. Not to mention that the database is now 20 megs instead of 5...
> Would gdbm be expected to work faster? (I currently don't even have it.)

The problem with training is that the update_probabilities() method,
which is called at the end, goes through the whole database and updates
just about every word.  So the whole database is touched and needs to
be written to disk, no matter how small the message you trained on.
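
To illustrate the effect, here is a rough sketch in Python. It is a toy
model, not the actual Spambayes classifier code: the ToyClassifier and
CountingStore classes, the record layout, and the probability formula are
all assumptions made for illustration. Only the name
update_probabilities() comes from the real code. The point it tries to
show is that the per-word probability depends on the total message
counts, so training on one message changes almost every record.

```python
# Toy model only: NOT the real Spambayes classifier. All names except
# update_probabilities() are hypothetical.

class CountingStore(dict):
    """A dict that counts writes, standing in for a dbm-backed database."""

    def __init__(self):
        super().__init__()
        self.writes = 0

    def __setitem__(self, key, value):
        self.writes += 1
        super().__setitem__(key, value)


class ToyClassifier:
    def __init__(self):
        self.db = CountingStore()  # word -> (spamcount, hamcount, prob)
        self.nspam = 0
        self.nham = 0

    def learn(self, words, is_spam):
        """Train on one message, then recompute all probabilities."""
        if is_spam:
            self.nspam += 1
        else:
            self.nham += 1
        # Only the words of this message get their counts bumped...
        for word in set(words):
            spam, ham, prob = self.db.get(word, (0, 0, 0.5))
            if is_spam:
                spam += 1
            else:
                ham += 1
            self.db[word] = (spam, ham, prob)
        # ...but the totals changed, so every probability is now stale.
        self.update_probabilities()

    def update_probabilities(self):
        """Rewrite the record of every word in the database."""
        nspam = max(self.nspam, 1)
        nham = max(self.nham, 1)
        for word, (spam, ham, _) in list(self.db.items()):
            spamratio = spam / nspam
            hamratio = ham / nham
            prob = spamratio / (spamratio + hamratio)
            self.db[word] = (spam, ham, prob)


clf = ToyClassifier()
clf.learn(["cheap", "pills", "now"], is_spam=True)
clf.learn(["meeting", "agenda", "now"], is_spam=False)

clf.db.writes = 0                     # reset the counter
clf.learn(["agenda"], is_spam=False)  # a one-word message
print(clf.db.writes, len(clf.db))     # prints "6 5"
```

Training on a one-word message costs one write for that word plus one
write for each of the 5 words already stored. With a real database the
second part is what dominates.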

-- Sjoerd Mullender <sjoerd@acm.org>


