[Spambayes] Re: There Can Be Only One

Tim Peters tim.one@comcast.net
Thu, 26 Sep 2002 15:51:10 -0400


>>> SPAM: *  2.2 -- From: has a malformed address

>> We have no code to catch that.

[Neil]
> Perhaps we could generate a token based on the form of the To and From
> addresses (e.g. if they include a real name and are valid based on
> RFCs).

Sure.  For example,

    yield 'From: has a malformed address'

The "words" we generate can be any strings whatsoever.  But you knew that.
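
To make Neil's suggestion concrete, here is a minimal sketch of a generator that yields tokens describing the form of a From: header. The function name `crack_from_header` and the token string 'from:has real name' / 'from:no real name' are hypothetical, invented for illustration, and the validity check is a deliberately crude stand-in for real RFC address parsing.

```python
from email.utils import parseaddr

def crack_from_header(from_value):
    """Hypothetical helper: yield tokens describing the form of a From: header."""
    realname, addr = parseaddr(from_value)
    # Token for whether a display name accompanies the address.
    if realname:
        yield 'from:has real name'
    else:
        yield 'from:no real name'
    # Crude check (an assumption, not a full RFC parser): require a
    # non-empty local part, an @ sign, and a dot somewhere in the domain.
    local, sep, domain = addr.rpartition('@')
    if not (sep and local and '.' in domain):
        yield 'From: has a malformed address'
```

Something like list(crack_from_header(msg['From'])) could then be mixed into the rest of the token stream, and the classifier would decide for itself how much weight each string deserves.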

>>> SPAM: *  1.7 -- Message-Id has no @ sign

>> Ditto.

> Again, maybe we could generate tokens based on the form of the
> Message-Ids.  Common MTAs and MUAs have specific ways of generating
> message ids.

Ditto.
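
The same idea could be sketched for Message-Id.  The name `crack_message_id`, the loose regular expression, and the 'message-id:@domain' token are all assumptions made for illustration; the pattern only approximates the <left@right> shape and is not a faithful RFC 2822 validator.

```python
import re

# Loose approximation of the <left@right> msg-id shape (an assumption,
# not the full RFC 2822 grammar).
_MSGID_RE = re.compile(r'^<[^<>@\s]+@[^<>@\s]+>$')

def crack_message_id(msgid):
    """Hypothetical helper: yield tokens describing the form of a Message-Id."""
    msgid = msgid.strip()
    if '@' not in msgid:
        yield 'Message-Id has no @ sign'
    if not _MSGID_RE.match(msgid):
        yield 'Message-Id is not valid'
    else:
        # Coarse clue about which software generated the id: the part
        # after the @ sign, lowercased.
        domain = msgid[1:-1].split('@', 1)[1].lower()
        yield 'message-id:@%s' % domain
```

Whether the right-hand side of the id is useful evidence, or just noise, is something only testing against real corpora could settle.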

>>> SPAM: *  1.6 -- Invalid Date: header (not RFC 2822)

>> Ditto.

>>> SPAM: *  1.2 -- Message-Id is not valid, according to RFC 2822

>> Ditto.

> Same as above.

Ditto.  If you want to tackle this, that would be great, and I expect these
are pieces of evidence we could enable regardless of corpus source.  Very
brief msgs are darned hard to score, and I expect there's a world of info in
the headers I'm not getting at now.
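
As one more illustration of mining the headers, here is a rough sketch that turns a missing or unparseable Date: header into a token.  The function name `date_tokens` and the 'date:missing' token are hypothetical.  Note that `email.utils.parsedate_tz` is fairly lenient, so this flags only dates it cannot parse at all, which is weaker than a strict RFC 2822 check.

```python
import email
from email.utils import parsedate_tz

def date_tokens(msg_text):
    """Hypothetical helper: yield tokens about the Date: header of a raw message."""
    msg = email.message_from_string(msg_text)
    date_value = msg.get('Date')
    if date_value is None:
        yield 'date:missing'
    elif parsedate_tz(date_value) is None:
        # parsedate_tz returns None when it cannot make sense of the value.
        yield 'Invalid Date: header'
```

A well-formed date yields nothing here, so such a message simply contributes no evidence from this check.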