[Spambayes] mboxtrain.py chokes on bugtraq email messages
T. Alexander Popiel
popiel at wolfskeep.com
Mon Apr 14 12:07:05 EDT 2003
In message: <GCXV83TNRQ1THDWTGCOM61SMGDOM4X87.3e9af5c6 at myst>
<tim at fourstonesExpressions.com> writes:
>This is a multipart/digest message, and a known problem. Keep an eye out for
>the fix checkin. It'll get fixed one of these days.
Here's a question: what is the proper behaviour for these messages?
Should the entire message get a ham/spam score, should the individual
sub-messages get their own scores, or both? If both, how should the
individual scores be combined into the overall score? Should the digest
be broken into multiple messages: one containing ham, one containing
spam, and one containing unsure?
My initial impulse is to score each sub-message individually, and if
any of them are ham, mark the entire thing as ham. If none are ham,
but some are unsure, mark the overall message as unsure. Otherwise
mark it as spam. As to the debug clue headers, I have no idea how
to handle them...
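A minimal sketch of that combining rule, for illustration only. It uses the standard library email package to pull the sub-messages out of a digest. The names combine_verdicts, sub_messages and classify_digest are hypothetical, and classify stands in for whatever per-message scorer is used; none of this is Spambayes' actual API.

```python
from email import message_from_string

HAM, UNSURE, SPAM = "ham", "unsure", "spam"


def combine_verdicts(verdicts):
    """Combine per-sub-message verdicts into one verdict for the digest.

    Any ham makes the whole digest ham; otherwise any unsure makes it
    unsure; otherwise it is spam. An empty input also falls through to
    spam, so a caller may want to treat that case separately.
    """
    verdicts = list(verdicts)
    if HAM in verdicts:
        return HAM
    if UNSURE in verdicts:
        return UNSURE
    return SPAM


def sub_messages(digest):
    """Yield each message embedded as a message/rfc822 part of a digest."""
    for part in digest.walk():
        if part.get_content_type() == "message/rfc822":
            # The payload of a message/rfc822 part is a one-element list
            # holding the embedded message.
            yield part.get_payload(0)


def classify_digest(digest, classify):
    """Score each sub-message with `classify` (assumed to return one of
    HAM, UNSURE or SPAM) and combine the results."""
    return combine_verdicts(classify(m) for m in sub_messages(digest))
```

For example, classify_digest(message_from_string(raw_text), my_classifier) would return "ham" as soon as one sub-message is classified as ham. The debug clue headers are not handled here.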