[Spambayes] Suggestion for HTML analysis

Tom Bates tfbiv at comcast.net
Sun Sep 14 17:48:39 EDT 2003


I'm new to the list. I hope this topic hasn't already been beaten to death,
but recently I've gotten HTML-formatted spam that attempts to circumvent
recognition by inserting copious amounts of garbage HTML tags between
letters, like so (an actual sample):

Co<!an zsayjpoa dlweabk  sni o
hgmysios i gdkqfwvin da  byn
wkt pt g    py wd
k!>nsoli<!wuis me l rj mrdc
ebsi vhviyrz
auu xxq tp
 ffpmsck wklzmuyvtb tg u lhk cqny rm
r
yb!>dat<!w j t i
b qdsg
bm

jhj
qyjq gbbbej eu
pf
 chlhqj  sedz g stb p mbjo ned ybssswbv yg!>ion

All this just to spell the word "Consolidation" without detection. I think
Spambayes is fooled by this technique, because I don't see any of the
operative words in the analysis. Could Spambayes look for an opening <body>
tag and go into HTML detection mode?
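
A minimal sketch of the kind of preprocessing this would involve, in Python.
This is not Spambayes code: the name strip_bogus_tags and the regular
expression are assumptions made for illustration, and the pattern only covers
the "<! ... >" runs seen in the sample above, not HTML in general.

```python
import re

# Hypothetical pattern: "<!" followed by anything up to the next ">".
# [^>] also matches newlines, so junk spanning several lines is covered.
BOGUS_TAG = re.compile(r"<![^>]*>")

def strip_bogus_tags(text):
    """Remove "<! ... >" runs so the letters around them are rejoined."""
    return BOGUS_TAG.sub("", text)

# Shortened version of the sample above.
sample = "Co<!an zsayjpoa\nwd k!>nsoli<!wuis me\nyb!>dat<!w j t i\nyg!>ion"
print(strip_bogus_tags(sample))  # prints "Consolidation"
```

With the junk removed before tokenizing, the classifier would at least see
the word the spammer tried to hide. The presence of such tags might also be
worth counting as a clue in its own right.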

Spambayes is working very well for me. About 40% of spam goes into the
"maybe" bucket at this point, and that percentage seems to be slowly
improving.

Thanks
Tom

