[spambayes-dev] training from IMAP folder? (with patch)

Tony Meyer tameyer at ihug.co.nz
Tue Jan 20 21:20:35 EST 2004


> The reason I need this feature (as opposed to the IMAP 
> filter) is to implement server-side spam filtering (using cyrus)
> and training which is intuitive for lay mail users.

I'm not sure why imapfilter wouldn't work here (i.e. users still drag mail
to train to particular folders, and on the server, imapfilter is run once
for each user (as mboxtrain presumably will be), set to train on the
appropriate folders and not to classify).  Be that as it may, I don't see any
problem with adding this to mboxtrain.

Comments on the patch:

Rather than doing this:
    message_flags = string.replace(message_flags, '\\Recent ', '')
You could do this:
    message_flags = message_flags.replace('\\Recent ', '')

This removes the need for importing string, and is, I believe, the more
'correct' way to do it.  The import of imaplib should be at the top, too,
according to the rather loose SpamBayes coding rules.

I'm also curious about whether the single space at the end works in both
cases: when the \Recent flag is the only flag present (or the last one),
and when other flags follow it.
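To illustrate the concern about the trailing space, here is a minimal sketch
(not part of the patch; strip_recent is a hypothetical helper name).  The
literal replace only matches when a space follows the flag, so a lone or
trailing \Recent would be left in place.  Splitting on whitespace avoids
depending on the spacing at all:

```python
def strip_recent(message_flags):
    """Return the flag string with the Recent flag removed."""
    # Split on whitespace so no particular spacing around the flag is assumed.
    # Assumption: the flag is spelled exactly '\Recent' (no case folding).
    flags = [f for f in message_flags.split() if f != '\\Recent']
    return ' '.join(flags)


# Compare the literal replace with the split-based helper.
for sample in ('\\Recent \\Seen', '\\Seen \\Recent', '\\Recent'):
    literal = sample.replace('\\Recent ', '')
    print(repr(sample), '->', repr(literal), 'vs', repr(strip_recent(sample)))
```

The helper joins the remaining flags with single spaces, which is an
assumption about what the rest of the code expects.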

Rather than getting the message headers and body separately, you could use
"RFC822" to get both together.  You could also use "BODY.PEEK[]" to get it
without setting the \Seen flag.  (sb_imapfilter.py needs to be updated to
use "BODY.PEEK[]").
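As a rough sketch of the BODY.PEEK[] approach (written for a current Python 3
imaplib, not taken from the patch or from sb_imapfilter.py; fetch_message is
a hypothetical name), a single fetch returns headers and body together and
leaves the \Seen flag alone:

```python
import email
import imaplib


def fetch_message(conn, msg_num):
    """Fetch one whole message without marking it as read."""
    # BODY.PEEK[] asks for the full message (headers and body together)
    # and, unlike RFC822 or BODY[], does not set the \Seen flag.
    typ, data = conn.fetch(msg_num, '(BODY.PEEK[])')
    if typ != 'OK':
        raise imaplib.IMAP4.error('fetch failed for message %s' % msg_num)
    # imaplib returns a list whose first item is a (response, payload) tuple.
    raw = data[0][1]
    return email.message_from_bytes(raw)
```

Here conn is assumed to be an imaplib.IMAP4 (or IMAP4_SSL) object that has
already logged in and selected the training folder.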

Ideally, mboxtrain and imapfilter would use the same code to do this.  That
would save a lot of maintenance and hassle.  It's difficult to import from
imapfilter, since it's in the scripts
directory, so I suppose the solution would be to create a new module in the
spambayes directory, and have both imapfilter and mboxtrain import from
that.  If you did that, you could just:

 1. Create an IMAPSession object (with server + port details).
 2. Call Login() on this object.
 3. For each of the folders to train
    a.  Create an IMAPFolder object
    b.  Call Train() on this object.
 4. Call Logout() on the IMAPSession object.
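The steps above could look something like the following sketch.  The two
classes here are hypothetical stand-ins that only record what was called;
the constructor arguments and the Train() signature are assumptions, and the
real classes in sb_imapfilter.py may well differ:

```python
class IMAPSession:
    """Hypothetical stand-in for a shared IMAP session class."""

    def __init__(self, server, port):
        self.server = server
        self.port = port
        self.log = []  # records calls so the flow can be inspected

    def Login(self, username, password):
        self.log.append('login %s' % username)

    def Logout(self):
        self.log.append('logout')


class IMAPFolder:
    """Hypothetical stand-in for a folder that can train on its contents."""

    def __init__(self, name, session):
        self.name = name
        self.session = session

    def Train(self, is_spam):
        kind = 'spam' if is_spam else 'ham'
        self.session.log.append('train %s as %s' % (self.name, kind))


def train_from_imap(server, port, username, password,
                    ham_folders, spam_folders):
    """Walk steps 1-4: create session, log in, train folders, log out."""
    session = IMAPSession(server, port)
    session.Login(username, password)
    try:
        for name in ham_folders:
            IMAPFolder(name, session).Train(is_spam=False)
        for name in spam_folders:
            IMAPFolder(name, session).Train(is_spam=True)
    finally:
        # Log out even if training one of the folders fails.
        session.Logout()
    return session.log
```

With this shape, mboxtrain would only need to supply the server details and
the folder names, and the IMAP handling would live in one place.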

The only modifications that would need to be done are to support the standard
mboxtrain "include_trained" header option and the "remove trained" option.
These would be simple additions to the code, though.

Thoughts?

=Tony Meyer



