Mark Saprio wrote:
The messages are being discarded because they have an
X-Spam-Status: Yes
Mark, this isn't strictly correct, I think. cre.search() is going to look for any place in the string where the regex matches, so they're *actually* being discarded because they have a header: X-Spam-Status: .*Yes.*
(Assuming the leading space is stripped when the header value is stored in a message. This seems like reasonable behaviour to me, but I'm not sure what the protocol says about spaces there.)
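For what it's worth, the leading-space assumption can be checked against Python's standard email package. This is only a minimal sketch, not Mailman's own code path, and the sample message (including the score) is made up for illustration:

```python
import email

# Hypothetical raw message; only the header matters here.
raw = "X-Spam-Status: Yes, score=7.1\n\nbody\n"

msg = email.message_from_string(raw)
value = msg["X-Spam-Status"]

# repr() makes any leading whitespace in the stored value visible.
print(repr(value))
```

With the stock parser the value comes back without the space after the colon, which is what an anchored pattern such as '^Yes' would rely on.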
Now, here's the problem with this.
X-Spam-Status for a non-spam message may look like:
X-Spam-Status: No, score=-5.0 required=5.0 tests=ALL_TRUSTED,AWL,BAYES_00 ...
(and keeps going for a while.)
As should be pretty obvious, a case-insensitive search for 'Yes' finds a match inside 'BAYES'. This won't occur with the header_filter_rules, because they match against the header as a whole line, rather than treating the value separately. So, as an alternative, it should be possible to use a KNOWN_SPAMMERS entry of ('X-Spam-Status', '^Yes').
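Here is a small sketch of the difference between the unanchored and anchored patterns. It assumes the filter compiles the pattern case-insensitively and calls search() on the header value alone; is_match is a hypothetical helper, not a Mailman function, and the spam value is invented:

```python
import re

def is_match(pattern, value):
    # Assumption: the rule is compiled with IGNORECASE and applied
    # with search(), which scans the whole string for a match.
    cre = re.compile(pattern, re.IGNORECASE)
    return cre.search(value) is not None

# Non-spam value taken from the example header above.
ham = "No, score=-5.0 required=5.0 tests=ALL_TRUSTED,AWL,BAYES_00"
# Made-up spam value for comparison.
spam = "Yes, score=7.1 required=5.0 tests=BAYES_99"

print(is_match("Yes", ham))    # True: 'YES' is found inside 'BAYES_00'
print(is_match("^Yes", ham))   # False: the value starts with 'No'
print(is_match("^Yes", spam))  # True: the value starts with 'Yes'
```

The '^' anchor is what keeps search() from wandering into the tests= list.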
On 3/1/06, Joel Heenan <joel.heenan@sensorynetworks.com> wrote:
I'll do some research if I get time today but I'm fairly sure something is borked with this spam filtering. Looking through the code I can see that if a header is not found it's not supposed to return a spam match. My SpamDetect module looks alright (to my largely python-ignorant eyes).
See above for why this wasn't working. I think it's pretty unlikely that any of your messages /don't/ have the X-Spam-Status header, assuming you're doing your own scanning.
- Patrick Bogen