[Spambayes] hammie, pop3proxy, and persistent_use_database

Neale Pickett neale@woozle.org
Tue Nov 19 17:46:27 2002


It seems like we're getting a fair number of people using hammie who
just want it to filter their mail.  These folks, I am guessing, are just
accepting the default values for things, assuming those must be a good
place to start.

Unfortunately, if you're running hammie out of procmail, the pickle
method is going to get really slow as your training set gets larger.
As fast as the pickler is, it still has to slurp in the entire file
every time hammie runs, which under procmail means once per message.
I'm talking several orders of magnitude slower than the database here.
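
To make the difference concrete, here is a rough sketch of the two access
patterns.  It is not the Spambayes storage code: the file names, the
"spam ham" record format, and the helper functions are all made up for
illustration, and it uses the standard library's generic dbm module rather
than whatever database hammie is actually built against.

```python
import dbm
import os
import pickle
import tempfile


def build_stores(directory, counts):
    """Write the same token counts to a pickle file and to a dbm file.

    `counts` maps token -> (spam_count, ham_count).  Both file names are
    hypothetical, chosen only for this sketch.
    """
    pickle_path = os.path.join(directory, "tokens.pck")
    db_path = os.path.join(directory, "tokens.db")
    with open(pickle_path, "wb") as f:
        pickle.dump(counts, f)
    with dbm.open(db_path, "c") as db:
        for token, (spam, ham) in counts.items():
            db[token.encode("utf-8")] = ("%d %d" % (spam, ham)).encode("ascii")
    return pickle_path, db_path


def lookup_from_pickle(pickle_path, tokens):
    """Deserialize the whole training set, then pick out a few tokens.

    The cost of pickle.load grows with the size of the training set,
    no matter how few tokens the message contains.
    """
    with open(pickle_path, "rb") as f:
        counts = pickle.load(f)
    return {t: counts[t] for t in tokens if t in counts}


def lookup_from_db(db_path, tokens):
    """Ask the database only for the requested keys.

    How much is really read from disk depends on the dbm backend.
    """
    found = {}
    with dbm.open(db_path, "r") as db:
        for t in tokens:
            key = t.encode("utf-8")
            if key in db:
                spam, ham = db[key].split()
                found[t] = (int(spam), int(ham))
    return found
```

Both lookups return the same answer for the same tokens.  The point is
only that the pickle version pays for the whole file on every run, which
is exactly what happens when procmail starts a fresh process per message.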

On the other hand, pop3proxy probably works best when using a pickle,
since it starts up once and can score many emails.  hammiesrv works
similarly, but I don't think anyone is using that :)

So, what would you say to moving the persistent_use_database option into
per-service configuration?  Specifically:

Index: Options.py
===================================================================
RCS file: /cvsroot/spambayes/spambayes/Options.py,v
retrieving revision 1.72
diff -u -r1.72 Options.py
--- Options.py  18 Nov 2002 19:14:48 -0000      1.72
+++ Options.py  19 Nov 2002 17:44:16 -0000
@@ -348,10 +348,11 @@
 # The default database path used by hammie
 persistent_storage_file: hammie.db
 
-# hammie can use either a database (quick to score one message) or a pickle
-# (quick to train on huge amounts of messages). Set this to True to use a
-# database by default.
-persistent_use_database: False
+[hammiefilter]
+# hammiefilter can use either a database (quick to score one message) or
+# a pickle (quick to train on huge amounts of messages). Set this to
+# True to use a database by default.
+hammiefilter_persistent_use_database: False
 
 [pop3proxy]
 # pop3proxy settings - pop3proxy also respects the options in the Hammie
@@ -366,6 +367,7 @@
 pop3proxy_spam_cache: pop3proxy-spam-cache
 pop3proxy_ham_cache: pop3proxy-ham-cache
 pop3proxy_unknown_cache: pop3proxy-unknown-cache
+pop3proxy_persistent_use_database: False
 
 [html_ui]
 html_ui_port: 8880
@@ -440,6 +442,8 @@
                'hammie_debug_header': boolean_cracker,
                'hammie_debug_header_name': string_cracker,
                },
+    'hammiefilter' : {'hammiefilter_persistent_use_database': boolean_cracker,
+                      },
     'pop3proxy': {'pop3proxy_server_name': string_cracker,
                   'pop3proxy_server_port': int_cracker,
                   'pop3proxy_port': int_cracker,
@@ -448,6 +452,7 @@
                   'pop3proxy_spam_cache': string_cracker,
                   'pop3proxy_ham_cache': string_cracker,
                   'pop3proxy_unknown_cache': string_cracker,
+                  'pop3proxy_persistent_use_database': boolean_cracker,
                   },
     'html_ui': {'html_ui_port': int_cracker,
                 'html_ui_launch_browser': boolean_cracker,
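
For anyone wondering what this would look like from the user's side, here
is a small sketch of per-service settings being read.  It uses the standard
configparser module rather than the Options.py machinery, and the helper
names and the example values are assumptions for illustration only.  Note
that the flag is read as a boolean: the raw string "False" is truthy in
Python, so reading it as a plain string would always select the database.

```python
import configparser

# Hypothetical user configuration: a procmail user turns the database on
# for hammiefilter, while pop3proxy keeps using the pickle.
EXAMPLE_INI = """
[hammiefilter]
hammiefilter_persistent_use_database: True

[pop3proxy]
pop3proxy_persistent_use_database: False
"""


def use_database(config, service):
    """Return the per-service flag, falling back to False (pickle)."""
    option = "%s_persistent_use_database" % service
    return config.getboolean(service, option, fallback=False)


def storage_kind(config, service):
    """Name the storage a given front end would open under this config."""
    return "database" if use_database(config, service) else "pickle"


config = configparser.ConfigParser()
config.read_string(EXAMPLE_INI)
```

With that configuration, storage_kind(config, "hammiefilter") selects the
database and storage_kind(config, "pop3proxy") selects the pickle, so each
service gets the storage that suits how it is run.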



