[Spambayes] hammie, pop3proxy, and persistent_use_database
Neale Pickett
neale@woozle.org
Tue Nov 19 17:46:27 2002
It seems like we're getting a fair number of people using hammie who
just want it to filter their mail. These folks, I am guessing, are just
accepting the default values, assuming those must be a good place to
start.
Unfortunately, if you're running hammie out of procmail, the pickle
method is going to get really slow as your training set gets larger.
As fast as the pickler is, it still has to slurp in the entire file
for every message procmail hands it. I'm talking several orders of
magnitude slower than a database lookup here.
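To make the difference concrete, here is a minimal sketch of the two access patterns. It uses only the standard pickle and shelve modules, not hammie's actual storage code, and the function names, file names, and token counts are all made up for illustration.

```python
import os
import pickle
import shelve
import tempfile


def build_stores(directory, n_tokens):
    """Write the same (hypothetical) token counts to a pickle and a shelve db."""
    counts = {"token%d" % i: (i % 7, i % 11) for i in range(n_tokens)}
    pickle_path = os.path.join(directory, "counts.pkl")
    with open(pickle_path, "wb") as f:
        pickle.dump(counts, f)
    db_path = os.path.join(directory, "counts")
    with shelve.open(db_path) as db:
        db.update(counts)
    return pickle_path, db_path


def lookup_with_pickle(path, tokens):
    # The whole file is deserialized, even if only a few tokens are needed.
    with open(path, "rb") as f:
        counts = pickle.load(f)
    return [counts.get(t) for t in tokens]


def lookup_with_db(path, tokens):
    # Only the requested keys are fetched from the database.
    with shelve.open(path, flag="r") as db:
        return [db.get(t) for t in tokens]


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        pkl, db = build_stores(d, 50000)
        wanted = ["token10", "no-such-token"]
        print(lookup_with_pickle(pkl, wanted))
        print(lookup_with_db(db, wanted))
```

Both lookups return the same answers. What differs is the work done per process start, which is exactly what a one-message-per-run procmail setup pays for every time. Wrapping each lookup in timeit would show how the gap behaves on your own machine.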
On the other hand, pop3proxy probably works best when using a pickle,
since it starts up once, loads the pickle once, and can then score many
emails. hammiesrv works similarly, but I don't think anyone is using
that :)
So, what would you say to moving the persistent_use_database option into
per-service configuration? Specifically:
Index: Options.py
===================================================================
RCS file: /cvsroot/spambayes/spambayes/Options.py,v
retrieving revision 1.72
diff -u -r1.72 Options.py
--- Options.py 18 Nov 2002 19:14:48 -0000 1.72
+++ Options.py 19 Nov 2002 17:44:16 -0000
@@ -348,10 +348,11 @@
# The default database path used by hammie
persistent_storage_file: hammie.db
-# hammie can use either a database (quick to score one message) or a pickle
-# (quick to train on huge amounts of messages). Set this to True to use a
-# database by default.
-persistent_use_database: False
+[hammiefilter]
+# hammiefilter can use either a database (quick to score one message) or
+# a pickle (quick to train on huge amounts of messages). Set this to
+# True to use a database by default.
+hammiefilter_persistent_use_database: False
[pop3proxy]
# pop3proxy settings - pop3proxy also respects the options in the
# Hammie
@@ -366,6 +367,7 @@
pop3proxy_spam_cache: pop3proxy-spam-cache
pop3proxy_ham_cache: pop3proxy-ham-cache
pop3proxy_unknown_cache: pop3proxy-unknown-cache
+pop3proxy_persistent_use_database: False
[html_ui]
html_ui_port: 8880
@@ -440,6 +442,8 @@
'hammie_debug_header': boolean_cracker,
'hammie_debug_header_name': string_cracker,
},
+ 'hammiefilter' : {'hammiefilter_persistent_use_database': boolean_cracker,
+ },
'pop3proxy': {'pop3proxy_server_name': string_cracker,
'pop3proxy_server_port': int_cracker,
'pop3proxy_port': int_cracker,
@@ -448,6 +452,7 @@
'pop3proxy_spam_cache': string_cracker,
'pop3proxy_ham_cache': string_cracker,
'pop3proxy_unknown_cache': string_cracker,
+ 'pop3proxy_persistent_use_database': boolean_cracker,
},
'html_ui': {'html_ui_port': int_cracker,
'html_ui_launch_browser': boolean_cracker,
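For anyone who wants to see how a service would pick up its own setting, here is a rough sketch. It uses the standard configparser module rather than the Options.py machinery, so load() and use_database() are hypothetical helpers. Only the section and option names come from the patch.

```python
import configparser

# Defaults mirroring the two options introduced in the patch.
DEFAULTS = """
[hammiefilter]
hammiefilter_persistent_use_database: False

[pop3proxy]
pop3proxy_persistent_use_database: False
"""


def load(user_overrides=""):
    """Read the defaults, then let the user's settings override them."""
    config = configparser.ConfigParser()
    config.read_string(DEFAULTS)
    config.read_string(user_overrides)
    return config


def use_database(config, service):
    """Return the per-service flag as a real bool, not as a string."""
    option = "%s_persistent_use_database" % service
    return config.getboolean(service, option)


if __name__ == "__main__":
    overrides = "[hammiefilter]\nhammiefilter_persistent_use_database: True\n"
    config = load(overrides)
    for service in ("hammiefilter", "pop3proxy"):
        print(service, use_database(config, service))
```

The boolean conversion matters: if the value were kept as the string "False", it would count as true in an `if` test, and the service would quietly use the database regardless of the setting.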