[spambayes-dev] RE: [Spambayes-checkins]
spambayes/spambayesmessage.py, 1.39, 1.40 storage.py, 1.35, 1.36
mhammond at skippinet.com.au
Wed Oct 8 01:07:49 EDT 2003
> After the last round of worm spew subsided, I switched all my (3) Outlook
> classifiers back to using bsddb3 and (now) Python 2.3.2. I still haven't
> seen any database corruption, and I get about 800 emails each day now. In
> addition, I've been doing a lot of "real development work" on Win98SE lately
> (well, *trying* to do real work <wink>), and Outlook typically suffers from
> forced OS reboots several times each day. That is, we're not only not
> shutting down cleanly then, we're not getting to do *any* shutdown cleanup
> then. Sometimes Outlook 2K itself takes 10 minutes to come up again after a
> reboot (that's got nothing to do with spambayes, btw -- it's always behaved
> this way), but the Berkeley db never complains.
FWIW, my experience is similar. I use Win2k so never lose the OS itself,
but *regularly* kick Outlook in the nuts via the "Task Manager". I've never
seen this error, or any other bsddb-related error either.
I haven't followed this closely enough, but is it possible it depends on the
specific Sleepycat version (and therefore indirectly on the specific Python
version or source)?
> C:\WINDOWS\Application Data\SpamBayes>dir *.db
> DEFAUL~1 DB 2,621,440 10-07-03 10:31p default_bayes_database.db
> DEFAUL~2 DB 98,304 10-07-03 10:31p
> 2 file(s) 2,719,744 bytes
Mine are almost exactly twice that size, so still in the same league.
However, as Tony says, this is *mainly* for non-Outlook users, so I expect
their training patterns to be different. E.g., I believe we can now train on
Outlook Express files - but presumably this training will process *every*
message, rather than single folders. I don't know enough about the proxy,
but I suspect you may be on the right track that the "average" db size for
Outlook users is radically different to other users.
Digging a little more, bug
702&atid=498103 is nice enough to have a log from an Outlook session with
this error - of note:
Bayes database initialized with 297 spam and 21252 good messages
*** - message database has 21269 messages - bayes has 21549 - something is
Implying a large database is being used in that case at least.
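FWIW, the counts in that log are internally consistent: 297 spam plus 21252 good gives the 21549 that "bayes has", so the mismatch is against the message database's 21269. A minimal sketch of that arithmetic follows; the function name is hypothetical and this is not the actual SpamBayes check, just the sums from the log above.

```python
def count_mismatch(nspam, nham, message_db_count):
    """Compare the classifier's trained total with the message database count.

    Hypothetical helper for illustration only; not SpamBayes code.
    Returns (bayes_total, bayes_total - message_db_count).
    """
    bayes_total = nspam + nham
    return bayes_total, bayes_total - message_db_count


# Numbers taken from the log lines quoted above.
bayes_total, difference = count_mismatch(297, 21252, 21269)
print(bayes_total, difference)  # 21549 280
```

So the classifier believes it has been trained on 280 more messages than the message database knows about.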