[Spambayes] No default check for missing subject header?

Tim Peters tim.peters at gmail.com
Sun Jan 16 19:25:34 CET 2005


[Mathew Hendry]
> I've just noticed that there's currently no specific token (by
> default in 1.0.1 anyway) indicating a lack of a subject header.

This should happen if you set the record_header_absence Tokenizer
option to True.  As the comment in Options.py says,

    When True, generate a "noheader:HEADERNAME" token for
    each header in safe_headers (below) that *doesn't* appear in
    the headers.  This helped in various of Tim's python.org tests,
    but appeared to hurt a little in Anthony Baxter's tests.

Since it had mixed results in testing, it's not enabled by default in
general.  It _may_ be enabled by default in the Outlook addin, though
(can't remember -- it's certainly enabled in my .ini file <wink>).

> I've received quite a lot of blank spam recently - mostly from
> severely broken spamware by the looks of it; many with headers
> misplaced in the body - and this would add to the scant clues.
> Mail with nearly all headers missing also seems to be a
> common characteristic of dictionary attack probes.

Yup, and record_header_absence also synthesizes tokens for the lack of
a Date line, From line, To line, Received line, etc.
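
A rough sketch of the idea, not the actual tokenizer.py code: for each
header name in a list, emit a "noheader:NAME" token when the message
lacks that header.  The SAFE_HEADERS tuple and the
missing_header_tokens function are hypothetical names made up for
illustration; the real safe_headers list in Options.py may well differ.

```python
from email import message_from_string

# Hypothetical stand-in for the tokenizer's safe_headers option; the
# real list shipped with SpamBayes may contain different names.
SAFE_HEADERS = ("subject", "date", "from", "to", "received")


def missing_header_tokens(raw_message, safe_headers=SAFE_HEADERS):
    """Yield a "noheader:NAME" token for each listed header that is absent."""
    msg = message_from_string(raw_message)
    # Header names are case-insensitive, so compare in lowercase.
    present = {name.lower() for name in msg.keys()}
    for name in safe_headers:
        if name not in present:
            yield "noheader:" + name


# A message carrying only From and To lines.
sample = "From: a@example.com\nTo: b@example.com\n\nbody text\n"
tokens = list(missing_header_tokens(sample))
```

Here tokens would hold one entry each for the missing Subject, Date and
Received lines, so a nearly headerless probe produces several such clues
rather than just one for the subject.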

> I changed the subject handling in tokenizer.py to yield a
> subject:none token in such cases, but haven't tried retraining yet
> to see exactly how much difference it makes.

I bet record_header_absence would help you.

