[Spambayes] Another software in the field
Justin Mason
jm@jmason.org
Tue Nov 19 14:14:11 2002
(a bit late in replying! I suffered from inbox overload ;)
T. Alexander Popiel said:
> If the received parser were a little smarter about parsing iPlanet
> received lines, it would have "pcp736393pcs.reston01.va.comcast.net"
> instead of "cj569191b" as the first element in the sequence, and
> the match list would have been 2 -> 1 -> 2 -> 0 -> 0, yielding:
>
> message-id-generation:skipped 0
>
> I suspect that high skipped numbers would be a strong spam indicator,
> showing where message ids were omitted in the sent mail and/or received
> headers naively forged to prevent backtracking.
It would be interesting to test this; we do something similar in
SpamAssassin to find possibly-forged hostnames in the Received
headers, and we do try to figure out where in the Received chain
the Message-Id was added.
Two problems we've seen:
- some totally-legit senders, especially of auto-generated mail, have a
bad habit of leaving out the Message-Id until it gets to *your* MX.
Annoying, but allowed by the RFCs. This test would have to figure
this out in some way; maybe by adding the sender's hostname or domain
to the token, so the legit folks gain ham hits, but spammers remain
as 1-spam 0-ham hapaxes?
- some senders use e.g. hostname "mylittlecompany.com" on their desktop
machine or home LAN, then connect via a commodity-DSL connection,
resulting in a reverse-lookup of "dsl43-234.bigisp.net". In other
words, the rDNS does not match what the sender wishes it did ;)
Not a problem in this case, but worth noting when talking about
Received-header parsing.
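
To make the first idea concrete, here is a rough sketch of a token that
carries the sender's domain. It is only an illustration, not what
SpamAssassin or Spambayes actually does: the function names are made up,
the skipped count is assumed to come from whatever Received parser is in
use, and taking the domain from the From: header is just one possible
choice.

```python
from email import message_from_string
from email.utils import parseaddr


def sender_domain(msg):
    """Return the lowercased domain of the From: address, or '' if none."""
    _, addr = parseaddr(msg.get("From", ""))
    if "@" not in addr:
        return ""
    return addr.rpartition("@")[2].lower()


def skipped_token(msg, skipped):
    """Build a 'skipped' token that also names the sender's domain.

    `skipped` is assumed to be computed elsewhere by the Received parser
    (hypothetical; not shown here).
    """
    domain = sender_domain(msg) or "unknown"
    return "message-id-generation:skipped %d %s" % (skipped, domain)


if __name__ == "__main__":
    raw = "From: Alice <alice@Example.COM>\nSubject: hi\n\nbody\n"
    print(skipped_token(message_from_string(raw), 2))
```

A legit sender who habitually omits the Message-Id would then keep
producing the same token and pick up ham counts, while a spammer's token
would likely differ from one run to the next.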
--j.