[Spambayes] Re: There Can Be Only One
Tim Peters
tim.one@comcast.net
Thu, 26 Sep 2002 15:51:10 -0400
>>> SPAM: * 2.2 -- From: has a malformed address
>> We have no code to catch that.
[Neil]
> Perhaps we could generate a token based on the form of the To and From
> addresses (e.g. if they include a real name and are valid based on
> RFCs).
Sure. For example,
    yield 'From: has a malformed address'
The "words" we generate can be any strings whatsoever. But you knew that.
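
A minimal sketch of such a generator, to make the idea concrete. The function name tokenize_address_form and the extra "has real name" / "no real name" tokens are hypothetical, not existing tokenizer code, and the validity check is deliberately crude (it asks only for a local part, an @, and a dot in the domain) rather than a full RFC check:

```python
from email.utils import parseaddr


def tokenize_address_form(header_name, value):
    """Yield synthetic tokens describing the shape of an address header.

    Hypothetical helper: the check below is a rough heuristic, not RFC validation.
    """
    realname, addr = parseaddr(value or "")
    local, at, domain = addr.partition("@")
    if not (local and at and "." in domain):
        # Same token text as the one-line example above.
        yield "%s: has a malformed address" % header_name
        return
    if realname:
        yield "%s: has real name" % header_name
    else:
        yield "%s: no real name" % header_name
```

Calling list(tokenize_address_form("From", value)) on a header value gives the tokens to hand to the classifier alongside the ordinary words.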
>>> SPAM: * 1.7 -- Message-Id has no @ sign
>> Ditto.
> Again, maybe we could generate tokens based on the form of the
> Message-Ids. Common MTAs and MTUs have specific ways of generating
> message ids.
Ditto.
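
The same trick could look roughly like this for Message-Ids. This is only a sketch: tokenize_message_id and its token strings are made up for illustration, and emitting the right-hand side of the id as a token is an assumption about what might distinguish one generator of message ids from another, not a tested result:

```python
def tokenize_message_id(msgid):
    """Yield synthetic tokens describing the form of a Message-Id value.

    Hypothetical helper; the token strings are arbitrary labels.
    """
    msgid = (msgid or "").strip()
    if not msgid:
        yield "message-id: missing"
        return
    if not (msgid.startswith("<") and msgid.endswith(">")):
        yield "message-id: no angle brackets"
    if "@" not in msgid:
        yield "message-id: no @ sign"
        return
    # Everything after the last @, lowercased, with the brackets removed.
    domain = msgid.strip("<>").rpartition("@")[2].lower()
    yield "message-id:@" + domain
```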
>>> SPAM: * 1.6 -- Invalid Date: header (not RFC 2822)
>> Ditto.
>>> SPAM: * 1.2 -- Message-Id is not valid, according to RFC 2822
>> Ditto.
> Same as above.
Ditto. If you want to tackle this, that would be great, and I expect these
are pieces of evidence we could enable regardless of corpus source. Very
brief msgs are darned hard to score, and I expect there's a world of info in
the headers I'm not getting at now.
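
For the Date: header, a rough sketch along the same lines. It leans on the standard library's email.utils.parsedate_tz, which is more forgiving than RFC 2822, so "unparseable" here is weaker than "not RFC 2822"; the function name and token strings are hypothetical:

```python
from email.utils import parsedate_tz


def tokenize_date_header(value):
    """Yield a synthetic token when a Date: header is missing or unparseable.

    Hypothetical helper; parsedate_tz is lenient, so this is not a strict
    RFC 2822 validity check.
    """
    if not value or not value.strip():
        yield "date: missing"
    elif parsedate_tz(value) is None:
        # parsedate_tz returns None when it cannot make sense of the value.
        yield "date: unparseable"
```

A well-formed date yields nothing at all, so the token shows up only as evidence when something is wrong with the header.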