[Spambayes] No default check for missing subject header?
tim.peters at gmail.com
Sun Jan 16 19:25:34 CET 2005
> I've just noticed that there's currently no specific token (by
> default in 1.0.1 anyway) indicating a lack of a subject header.
This should happen if you set the record_header_absence Tokenizer
option to True. As the comment in Options.py says,
    When True, generate a "noheader:HEADERNAME" token for
    each header in safe_headers (below) that *doesn't* appear in
    the headers.  This helped in various of Tim's python.org tests,
    but appeared to hurt a little in Anthony Baxter's tests.
Since it had mixed results in testing, it's not enabled by default in
general. It _may_ be enabled by default in the Outlook addin, though
(can't remember -- it's certainly enabled in my .ini file <wink>).
> I've received quite a lot of blank spam recently - mostly from
> severely broken spamware by the looks of it; many with headers
> misplaced in the body - and this would add to the scant clues.
> Mail with nearly all headers missing also seems to be a
> common characteristic of dictionary attack probes.
Yup, and record_header_absence also synthesizes tokens for the lack of
a Date line, From line, To line, Received line, etc.
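As a rough illustration of the idea (this is not the actual tokenizer.py code), the sketch below uses the standard library's email module to emit one "noheader:NAME" token for each expected header that is missing. The function name absent_header_tokens and the short HEADERS_TO_CHECK tuple are made up for the example; the real list is whatever safe_headers is set to in Options.py.

```python
import email

# Hypothetical subset of headers to check, for illustration only;
# the real list is the safe_headers option in Options.py.
HEADERS_TO_CHECK = ("subject", "date", "from", "to", "received")


def absent_header_tokens(raw_message, headers=HEADERS_TO_CHECK):
    """Yield a "noheader:NAME" token for each listed header the message lacks."""
    msg = email.message_from_string(raw_message)
    # Header names are case-insensitive, so compare in lowercase.
    present = {name.lower() for name in msg.keys()}
    for name in headers:
        if name not in present:
            yield "noheader:" + name


# A message with only From and To lines, as a broken spam tool might send.
raw = "From: a@example.com\nTo: b@example.com\n\nbody text\n"
tokens = list(absent_header_tokens(raw))
```

For a message like the one above, the missing Subject, Date and Received lines each produce their own token, so a nearly headerless probe gives the classifier several clues rather than one.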
> I changed the subject handling in tokenizer.py to yield a
> subject:none token in such cases, but haven't tried retraining yet
> to see exactly how much difference it makes.
I bet record_header_absence would help you.