[Spambayes] Suggestion for HTML analysis

wsy at merl.com wsy at merl.com
Mon Sep 15 07:12:20 EDT 2003


   From: "Tom Bates" <tfbiv at comcast.net>

   I'm new to the list. I hope this topic hasn't already been beaten to death,
   but recently I've gotten HTML-formatted spam that attempts to circumvent
   recognition by inserting copious amounts of HTML garbage tags between
   letters, like so (an actual sample):

   Co<!an zsayjpoa dlweabk  sni o
   hgmysios i gdkqfwvin da  byn
   wkt pt g    py wd
   k!>nsoli<!wuis me l rj mrdc
   ebsi vhviyrz
   auu xxq tp
    ffpmsck wklzmuyvtb tg u lhk cqny rm
   r
   yb!>dat<!w j t i
   b qdsg
   bm

   jhj
   qyjq gbbbej eu
   pf
    chlhqj  sedz g stb p mbjo ned ybssswbv yg!>ion

Removing spurious HTML comments is one of the two things that
CRM114 Mailfilter does that aren't entirely learned behavior (the
other is decoding base64-encoded content).

It helps a lot.  SpamBayes should at least consider it.
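
As a rough illustration of the idea (not CRM114's or SpamBayes' actual code), comment stripping before tokenizing could look like the sketch below. The function name `strip_html_comments` and the regular expression are assumptions made for this example. The pattern treats both real comments (`<!-- ... -->`) and the bogus `<! ... >` declarations seen in the sample as invisible markup.

```python
import re

# Assumption: anything of the form <!-- ... --> or <! ... > is invisible
# to the reader and can be dropped before tokenizing. Note that this
# also removes declarations such as <!DOCTYPE ...>.
_COMMENT_RE = re.compile(r"<!--.*?-->|<![^>]*>", re.DOTALL)


def strip_html_comments(text):
    """Return text with comment-like markup removed (hypothetical helper)."""
    return _COMMENT_RE.sub("", text)


# Shortened version of the sample quoted above.
sample = "Co<!an zsayjpoa\nk!>nsoli<!wuis me\nyb!>dat<!w j t i\nyg!>ion"
print(strip_html_comments(sample))  # Consolidation
```

With the junk removed, the tokenizer sees the word the recipient actually reads rather than a handful of fragments.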

   -Bill Yerazunis
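
For the base64 step mentioned above, a minimal sketch using Python's standard `email` package might look like this. The helper name `decoded_text_parts` and the latin-1 fallback are assumptions for illustration, not a description of how CRM114 or SpamBayes does it.

```python
import base64
from email import message_from_string


def decoded_text_parts(raw_message):
    """Return the decoded text of every text/* part (hypothetical helper)."""
    msg = message_from_string(raw_message)
    texts = []
    for part in msg.walk():
        if part.get_content_maintype() != "text":
            continue
        # decode=True undoes the Content-Transfer-Encoding (e.g. base64).
        payload = part.get_payload(decode=True)
        if payload is None:
            continue
        charset = part.get_content_charset() or "latin-1"
        try:
            texts.append(payload.decode(charset, errors="replace"))
        except LookupError:
            # Unknown charset name: fall back to latin-1 (assumption).
            texts.append(payload.decode("latin-1"))
    return texts


# Build a tiny base64-encoded message to exercise the helper.
body = base64.b64encode(b"Consolidation loans").decode("ascii")
raw = (
    "Content-Type: text/html; charset=us-ascii\n"
    "Content-Transfer-Encoding: base64\n"
    "\n" + body + "\n"
)
print(decoded_text_parts(raw))
```

The decoded text can then go through the same comment stripping and tokenizing as any other body.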


