[Spambayes] Suggestion for HTML analysis
Tom Bates
tfbiv at comcast.net
Sun Sep 14 17:48:39 EDT 2003
I'm new to the list. I hope this topic hasn't already been beaten to death,
but recently I've gotten HTML-formatted spam that attempts to circumvent
recognition by inserting copious amounts of HTML garbage tags between
letters, like so (an actual sample):
Co<!an zsayjpoa dlweabk sni o
hgmysios i gdkqfwvin da byn
wkt pt g py wd
k!>nsoli<!wuis me l rj mrdc
ebsi vhviyrz
auu xxq tp
ffpmsck wklzmuyvtb tg u lhk cqny rm
r
yb!>dat<!w j t i
b qdsg
bm
jhj
qyjq gbbbej eu
pf
chlhqj sedz g stb p mbjo ned ybssswbv yg!>ion
All this just to spell the word "Consolidation" without detection. I think
Spambayes is fooled by this technique, because I don't see any of the
operative words in the analysis. Could Spambayes look for an opening <body>
tag and go into HTML detection mode?
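To make the idea concrete, here is a rough sketch of the kind of preprocessing I have in mind. It is not how Spambayes actually tokenizes mail; the name strip_bogus_markup and the regular expression are my own assumptions for illustration. It simply deletes anything from "<!" up to the next ">", which is the form of the junk in the sample above, so the surrounding letters join back into one word.

```python
import re

# Assumption: the junk always has the form "<!" ... ">" as in the sample.
# [^>] also matches newlines, so junk spanning several lines is removed.
BOGUS_MARKUP = re.compile(r"<![^>]*>")


def strip_bogus_markup(html_text):
    """Remove "<!...>" spans so the letters around them rejoin.

    Hypothetical helper, not part of Spambayes.
    """
    return BOGUS_MARKUP.sub("", html_text)


# Shortened version of the sample from the message.
sample = (
    "Co<!an zsayjpoa dlweabk sni o\n"
    "k!>nsoli<!wuis me l rj mrdc\n"
    "yb!>dat<!w j t i\n"
    "yg!>ion"
)

print(strip_bogus_markup(sample))  # prints "Consolidation"
```

Note that this pattern would also remove DOCTYPE declarations and ordinary HTML comments, and it stops at the first ">" inside a comment, so a real implementation would probably need something more careful.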
Spambayes is working very well for me. About 40% of spam goes into the
"maybe" bucket at this point, and that percentage seems to be slowly
improving.
Thanks
Tom